Hi Sandeep,
>
> I have a query regarding this.
> While performing serial or parallel calculations, on increasing omp
> from 1 to 8, %age use of cpu's does not increase in the same scale
> (omp=2, 170 to 180%, omp=4, 300 to 330%, omp=8 only 500 to 550%).
> Is something wrong in configuring or compiling the softwares or due
> to some limitations in hardware.
> Any suggestions?

There are several factors. One is the threading support in the BLAS/LAPACK libraries, and another is the deficiencies of the Wien2k OpenMP parallelization. Hardware also comes into play, mostly in the general sense that the lower your memory bandwidth, the earlier the speedup flattens as you add threads.

If you look at the lapw1 output you can see that the total time is mostly divided into 3 parts, for example:

  TIME HAMILT (CPU) = 2.8, HNS = 2.9, HORB = 0.0, DIAG = 17.3, SYNC = 0.0
  TIME HAMILT (WALL) = 0.7, HNS = 0.8, HORB = 0.0, DIAG = 4.7, SYNC = 0.0

The scaling of the DIAG part depends mostly on how your libraries scale (MKL does quite OK, but don't expect miracles).

HAMILT scaling depends on the explicit Wien2k parallelization, and that one also doesn't scale too well past 4-6 cores. The reason is that I was mostly learning OpenMP when I wrote it, and I just went for the simplest "omp parallel for" solution, probably at too high a level (also because ifort's support for newer OpenMP versions with more advanced constructs was not so good at that time). I think there could still be some speedup if this were rewritten so that the parallelization happens at a different level, maybe more like the way it is parallelized with MPI, so that it fits better in the caches and can thus cope better with the memory bandwidth limits when scaling to more cores.

HNS has no explicit threading at all, and IIRC the library-level threading didn't help much for the BLAS/LAPACK calls there. This could also be improved by rewriting it to be more parallelization-friendly (possibly again mirroring how the MPI version does it, which scales fine IIRC), but I'm not an algebra expert, so I haven't even tried.

So yeah, there is no easy way to improve this, unless you know a bit about OpenMP and want to try it yourself (BTW, Prof. Blaha was always very welcoming to contributions, even though I'm not part of the Wien2k team :-) ).

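To see at a glance which part keeps the cores busy, one could divide the CPU time by the wall time for each part of the two TIME lines above. The following is only a rough Python sketch and not part of Wien2k. The helper names `parse_time_line` and `cpu_to_wall_ratio` are made up for this illustration, and it assumes the text contains exactly one CPU/WALL pair in the format shown.

```python
import re

# The two timing lines quoted above, copied verbatim.
SAMPLE = """\
TIME HAMILT (CPU) = 2.8, HNS = 2.9, HORB = 0.0, DIAG = 17.3, SYNC = 0.0
TIME HAMILT (WALL) = 0.7, HNS = 0.8, HORB = 0.0, DIAG = 4.7, SYNC = 0.0
"""

TAG = re.compile(r"\((CPU|WALL)\)")
ENTRY = re.compile(r"(\w+)\s*=\s*([\d.]+)")


def parse_time_line(line):
    """Return (kind, {part: seconds}) for one 'TIME HAMILT (...)' line."""
    kind = TAG.search(line).group(1)
    # Remove the (CPU)/(WALL) tag so that every entry reads "NAME = value".
    entries = ENTRY.findall(TAG.sub("", line))
    return kind, {name: float(value) for name, value in entries}


def cpu_to_wall_ratio(text):
    """Return {part: CPU/WALL} for every part with a non-zero wall time.

    Assumption: `text` holds exactly one CPU line and one WALL line;
    if there are several pairs, only the last one is used.
    """
    lines = [l for l in text.splitlines() if "TIME HAMILT" in l]
    times = dict(parse_time_line(l) for l in lines)
    cpu, wall = times["CPU"], times["WALL"]
    return {part: cpu[part] / wall[part] for part in wall if wall[part] > 0}


if __name__ == "__main__":
    for part, ratio in cpu_to_wall_ratio(SAMPLE).items():
        print(f"{part:7s} about {ratio:.1f} threads busy on average")
```

For the numbers above this gives roughly 4.0 for HAMILT, 3.6 for HNS and 3.7 for DIAG. Keep in mind that CPU/WALL only says how many cores were kept busy on average, not how much faster the run actually got; threads that spin while waiting may count as CPU time too, so a comparison of wall times between runs with different thread counts is the more reliable measure.
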
Best regards
Pavel