Inspired by the recent posts about gfortran and openblas, I made some timing tests myself.

I used the "Test-Case" (serial benchmark) from our website, a complex case with NMAT=3481.

I tested it on an Intel I7-3939 (6 core) processor with either ifort+mkl (2016.3.210) or gfortran+openblas.
I used 1, 2, 4 or 6 cores (set via OMP_NUM_THREADS) of one PC:

1 core:
Intel:    TIME HAMILT (WALL) =     5.2, HNS =     4.2, DIAG =    25.8
gfortran: TIME HAMILT (WALL) =    36.3, HNS =     4.0, DIAG =    25.0

2 cores:
Intel:    TIME HAMILT (WALL) =     5.3, HNS =     2.5, DIAG =    14.4
gfortran: TIME HAMILT (WALL) =    36.3, HNS =     2.4, DIAG =    13.4

4 cores:
Intel:    TIME HAMILT (WALL) =     5.3, HNS =     1.7, DIAG =     7.7
gfortran: TIME HAMILT (WALL) =    36.6, HNS =     1.7, DIAG =     7.9

6 cores:
Intel:    TIME HAMILT (WALL) =     5.3, HNS =     1.5, DIAG =     6.4
gfortran: TIME HAMILT (WALL) =    36.4, HNS =     2.0, DIAG =     7.4

So clearly OpenBLAS is very good and basically of the same quality as MKL: HNS and DIAG are even slightly faster with it on 1 and 2 cores, and only on 6 cores does MKL pull ahead.

But setting up the eigenvalue problem (HAMILT) involves the calculation of many cosines (exponentials), and with ifort we can use the "vector cosines" from MKL. This makes ifort about 7 times faster in this part. It can also be seen in the partial timings of hamilt in case.output1, where phase and us are significantly faster:

ifort
Time for al,bl    (hamilt, cpu/wall) :          0.3         0.3
Time for legendre (hamilt, cpu/wall) :          0.1         0.1
Time for phase    (hamilt, cpu/wall) :          1.1         1.3
Time for us       (hamilt, cpu/wall) :          1.2         1.2
Time for overlaps (hamilt, cpu/wall) :          2.0         1.9
Time for distrib  (hamilt, cpu/wall) :          0.1         0.0
gfortran
Time for al,bl    (hamilt, cpu/wall) :          0.2         0.3
Time for legendre (hamilt, cpu/wall) :          0.2         0.2
Time for phase    (hamilt, cpu/wall) :         25.9        25.3
Time for us       (hamilt, cpu/wall) :          6.3         6.8
Time for overlaps (hamilt, cpu/wall) :          2.8         3.0
Time for distrib  (hamilt, cpu/wall) :          0.0         0.0
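
To put numbers on that comparison, the wall-clock columns of the two listings can be divided step by step. This is only a small Python sketch: the dictionary and function names are chosen for illustration, and the values are the wall times copied from the output above (which carry only one decimal, so the ratios are rough).

```python
# Wall-clock times (s) of the hamilt sub-steps, copied from the listings above.
ifort = {"al,bl": 0.3, "legendre": 0.1, "phase": 1.3,
         "us": 1.2, "overlaps": 1.9, "distrib": 0.0}
gfortran = {"al,bl": 0.3, "legendre": 0.2, "phase": 25.3,
            "us": 6.8, "overlaps": 3.0, "distrib": 0.0}


def slowdown(step):
    """Return the gfortran/ifort wall-time ratio, or None if ifort shows 0.0."""
    base = ifort[step]
    return gfortran[step] / base if base > 0 else None


ratios = {step: slowdown(step) for step in ifort}

for step, r in ratios.items():
    label = "n/a" if r is None else f"{r:4.1f}x"
    print(f"{step:9s} ifort {ifort[step]:5.1f}  gfortran {gfortran[step]:5.1f}  ratio {label}")
```

With these numbers phase comes out roughly 19 times slower with gfortran and us between 5 and 6 times slower, while the remaining steps differ by a factor of two at most.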

This limits gfortran significantly, making it a factor of two slower in these tests (or a factor of three when using 4 cores).
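
These factors follow from summing the three wall times of each run. A minimal Python sketch of that arithmetic, with the numbers copied from the table above and names chosen only for illustration; it assumes that HAMILT + HNS + DIAG is the whole run time, which ignores any other part of the calculation:

```python
# (HAMILT, HNS, DIAG) wall times in seconds per core count, from the table above.
intel = {1: (5.2, 4.2, 25.8), 2: (5.3, 2.5, 14.4),
         4: (5.3, 1.7, 7.7), 6: (5.3, 1.5, 6.4)}
gfortran = {1: (36.3, 4.0, 25.0), 2: (36.3, 2.4, 13.4),
            4: (36.6, 1.7, 7.9), 6: (36.4, 2.0, 7.4)}

# Overall slowdown of the gfortran build relative to the Intel build.
factors = {n: sum(gfortran[n]) / sum(intel[n]) for n in intel}

# Fraction of the gfortran total that is spent in HAMILT.
hamilt_share = {n: gfortran[n][0] / sum(gfortran[n]) for n in gfortran}

for n in sorted(factors):
    print(f"{n} core(s): gfortran/Intel = {factors[n]:.2f}, "
          f"HAMILT share of gfortran total = {hamilt_share[n]:.0%}")
```

The factor is about 1.9 on one core and about 3.1 on four. It grows with the core count because HNS and DIAG get faster with more threads, whereas HAMILT stays at about 36 s in the gfortran build and so takes up an ever larger share of the total.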

Anyway, OpenBLAS is really good, and if somebody knows how to "vectorize" the cos, sin (exp) calls in gfortran, that would be very valuable.

Peter Blaha

On 12/08/2016 01:51 PM, John Rundgren wrote:
Dear Arthur,

"Linker Flags" and "R_LIB" are found by consulting google on
"xianyi-openblas user manual".

The "include" flag is necessary, otherwise there is a conflict with
/usr/link/ld.

Xianyi recommends -lopenblas and adds -lpthread -lgfortran with
motivations understood by wise Linuxers. They have not done any harm.

Could you improve calculation time ...? In a previous wien-bounces you
find a test where gfortran+openblas is fully competitive with intel+mkl.
A try is worthwhile.

Best regards / John


John Rundgren
Department of Theoretical Physics, KTH Royal Institute of Technology


_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:  
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html


--

                                      P.Blaha
--------------------------------------------------------------------------
Peter BLAHA, Inst.f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-1-58801-165300             FAX: +43-1-58801-165982
Email: bl...@theochem.tuwien.ac.at    WIEN2k: http://www.wien2k.at
WWW:   http://www.imc.tuwien.ac.at/TC_Blaha
--------------------------------------------------------------------------