For reference, the MPI timings can be approximated by

Total Time ~ Const1/Cores * (Jobs per Bus)**0.6 + Const2

with Const1 ~ 1900, the part of the CPU time that gets parallelized, and
Const2 ~ 60, a non-parallelizable overhead.

The dual quad-core has two memory buses.
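
As a rough check, the fit above can be evaluated against the MPI numbers quoted below. This is only a sketch under two assumptions that are not spelled out in the formula: "Cores" is taken as jobs/node times nodes, and "Jobs per Bus" is taken as the jobs on a node divided over its two buses and rounded up, so a single job still counts as one job on its bus. The function and variable names are made up for illustration.

```python
import math

CONST1 = 1900.0      # parallelizable part of the CPU time (from the fit)
CONST2 = 60.0        # non-parallelizable overhead (from the fit)
BUSES_PER_NODE = 2   # the dual quad-core has two memory buses

# (jobs per node, nodes) -> measured TOTAL CPU TIME from the MPI benchmark
MEASURED = {
    (1, 8): 296.4,
    (2, 4): 298.4,
    (4, 2): 406.2,
    (4, 4): 222.3,
    (4, 8): 150.8,
    (8, 1): 613.5,
    (8, 8): 136.8,
}


def predicted_time(jobs_per_node, nodes):
    """Evaluate Total Time ~ Const1/Cores * (Jobs per Bus)**0.6 + Const2."""
    # Assumption: one MPI job per core, so Cores = jobs/node * nodes.
    cores = jobs_per_node * nodes
    # Assumption: jobs are spread evenly over the buses, rounded up.
    jobs_per_bus = math.ceil(jobs_per_node / BUSES_PER_NODE)
    return CONST1 / cores * jobs_per_bus ** 0.6 + CONST2


def compare():
    """Return (jobs/node, nodes, measured, predicted, percent error) rows."""
    rows = []
    for (jobs, nodes), measured in MEASURED.items():
        predicted = predicted_time(jobs, nodes)
        error = (predicted - measured) / measured * 100.0
        rows.append((jobs, nodes, measured, predicted, error))
    return rows


if __name__ == "__main__":
    print("Jobs/Node  Nodes  Measured  Predicted  Error(%)")
    for jobs, nodes, measured, predicted, error in compare():
        print(f"{jobs:<9d}  {nodes:<5d}  {measured:8.1f}  {predicted:9.1f}  {error:8.1f}")
```

Under those assumptions the fit stays within roughly ten percent of each measured run, which is about what "approximately" should be taken to mean here.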

On Wed, Apr 23, 2008 at 7:17 PM, Laurence Marks
<L-marks at> wrote:
> Dual Quad-Core  E5410  @ 2.33GHz 1333 FSB
>  8GB 667MHz DDR2 FB-DIMM
>  Infiniband (more than fast enough, WALL=CPU within delta)
>  Adjacent Cache Line Prefetch Enabled (might not be best, not sure)
>  ifort 10.1.015 mkl
>  Blocksize = 64
>  Serial benchmark
>  Jobs/Node  Threads  Time
>  1          8         53.58
>  2          4         85.37
>  4          2        138.03
>  8          1        274.03
>  MPI benchmark
>  Jobs/Node  Nodes  Time
>  1          8      296.4
>  2          4      298.4
>  4          2      406.2
>  4          4      222.3
>  4          8      150.8
>  8          1      613.5
>  8          8      136.8
>  Run serial, 1 Job 8 Threads 764.58 (25% slower than mpi)
>  first number: jobs/node; 2nd number: nodes
>  Run_1_1: TIME HAMILT (CPU)  = 212.5, HNS = 229.9, HORB =   0.0, DIAG = 1589.8
>  Run_1_1: TOTAL CPU       TIME:   2034.4 (INIT =    2.1 + K-POINTS = 2032.3)
>  Run_1_8: TIME HAMILT (CPU)  =  31.8, HNS =  30.6, HORB =   0.0, DIAG = 231.4
>  Run_1_8: TOTAL CPU       TIME:    296.4 (INIT =    2.1 + K-POINTS =  294.3)
>  Run_2_4: TIME HAMILT (CPU)  =  32.4, HNS =  31.0, HORB =   0.0, DIAG = 232.8
>  Run_2_4: TOTAL CPU       TIME:    298.4 (INIT =    2.1 + K-POINTS =  296.4)
>  Run_4_2: TIME HAMILT (CPU)  =  34.8, HNS =  38.2, HORB =   0.0, DIAG = 331.0
>  Run_4_2: TOTAL CPU       TIME:    406.2 (INIT =    2.1 + K-POINTS =  404.1)
>  Run_4_4: TIME HAMILT (CPU)  =  18.0, HNS =  21.0, HORB =   0.0, DIAG = 181.0
>  Run_4_4: TOTAL CPU       TIME:    222.3 (INIT =    2.1 + K-POINTS =  220.2)
>  Run_4_8: TIME HAMILT (CPU)  =  10.7, HNS =  10.5, HORB =   0.0, DIAG = 127.3
>  Run_4_8: TOTAL CPU       TIME:    150.8 (INIT =    2.1 + K-POINTS =  148.7)
>  Run_8_1: TIME HAMILT (CPU)  =  39.4, HNS =  68.6, HORB =   0.0, DIAG = 503.3
>  Run_8_1: TOTAL CPU       TIME:    613.5 (INIT =    2.1 + K-POINTS =  611.5)
>  Run_8_8: TIME HAMILT (CPU)  =   7.0, HNS =   9.6, HORB =   0.0, DIAG = 117.9
>  Run_8_8: TOTAL CPU       TIME:    136.8 (INIT =    2.1 + K-POINTS =  134.8)
>  --
>  Laurence Marks
>  Department of Materials Science and Engineering
>  MSE Rm 2036 Cook Hall
>  2220 N Campus Drive
>  Northwestern University
>  Evanston, IL 60208, USA
>  Tel: (847) 491-3996 Fax: (847) 491-7820
>  email: L-marks at northwestern dot edu
>  Commission on Electron Diffraction of IUCR

