Dear David,

nice, ~30 seconds instead of ~150 :-) BTW, is this already with
"-DHAVE_LIBMVEC" in the compiler options?

For your real workflow you might also experiment with the number of
threads vs. the number of k-points run in parallel. Right now you seem
to be running 4 threads on the test_case, which has only 1 k-point, but
for small cases with lots of k-points I would expect k-point
parallelization to be the best.

Now that you link with the OpenMP-threaded OpenBLAS, it is important
that the number of k-points run in parallel times the number of threads
allowed for lapw1/lapw2 does not exceed the total number of cores. It
seems your environment already has OMP_NUM_THREADS set to 4 (at least),
so you need to set the OpenMP threading explicitly for Wien2k.

Specifically, try this .machines file (k-point parallel in lapw1/lapw2 +
OpenMP elsewhere, assuming the Intel(R) Xeon(R) CPU W3550 has 4
physical cores) with run_lapw -p for your standard use case:

-----------
1:localhost
1:localhost
1:localhost
1:localhost
omp_global:4
omp_lapw1:1
omp_lapw2:1
------------

or alternatively (two k-points and two threads in lapw1+2):

-------------
1:localhost
1:localhost
omp_global:4
omp_lapw1:2
omp_lapw2:2
-------------

Best regards
Pavel

On Wed, 2021-11-24 at 13:55 +0100, David Holec wrote:
> Dear Pavel,
>
> Many thanks again for your patience and guidance. With the
> libopenblas-openmp-dev package it seems to work well!
>
> $ ldd lapw1
>         linux-vdso.so.1 (0x00007ffca83d8000)
>         libopenblas.so.0 => /lib/x86_64-linux-gnu/libopenblas.so.0 (0x000014563f924000)
>         libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 (0x000014563f65c000)
>         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000014563f50d000)
>         libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 (0x000014563f4e1000)
>         libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x000014563f49f000)
>         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x000014563f47c000)
>         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x000014563f288000)
>         /lib64/ld-linux-x86-64.so.2 (0x0000145641b2d000)
>         libquadmath.so.0 => /lib/x86_64-linux-gnu/libquadmath.so.0 (0x000014563f23e000)
>         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x000014563f223000)
>         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x000014563f21d000)
>
> and these options in siteconfig:
>  L   Linker Flags:            $(FOPT) -L/usr/lib/x86_64-linux-gnu/openblas-openmp
>  R   R_LIBS (LAPACK+BLAS):    -lopenblas
>
> I also get better timings now (the TIME HAMILT values are slightly
> longer, but overall it is an improvement):
> $ ../x lapw1
>  STOP  LAPW1 END
> 119.400u 1.937s 0:32.53 372.9% 0+0k 0+37864io 0pf+0w
> $ grep HORB *output1*
> test_case.output1: TIME HAMILT (CPU)  = 17.3, HNS = 18.4, HORB = 0.0, DIAG = 85.0, SYNC = 0.0
> test_case.output1: TIME HAMILT (WALL) =  4.6, HNS =  5.2, HORB = 0.0, DIAG = 22.0, SYNC = 0.0
>
> Thanks for your help,
> David
> ---
> Dr David Holec
> Computational Materials Science group
> Department of Materials Science
> Montanuniversität Leoben
>
> Franz-Josef-Strasse 18, A-8700 Leoben, Austria
> tel. +43-(0)3842-4024211
> fax. +43-(0)3842-4024202
> materials.unileoben.ac.at
> cms.unileoben.ac.at
> ________________________________
> WHERE RESEARCH MEETS FUTURE
>
> On Wed, 24 Nov 2021 at 12:39, Pavel Ondračka <[email protected]> wrote:
> > Hi David,
> >
> > well, it is hard to say without the debug info why the OpenBLAS
> > crashes.
> > My guess is that you link with the 64-bit interface; try to install
> > the standard one (openblas-openmp-devel) and replace
> > openblas64-openmp with openblas-openmp everywhere in your config.
> > Also remove the -lpthread (just to be safe, in theory it should not
> > matter); it is not needed with OpenMP. If it still crashes, please
> > recompile with debug info enabled (add -g to compiler options) and
> > send me the x lapw1 output via PM.
> >
> > BTW my response was mostly motivated by me suspecting you actually
> > link against slow netlib BLAS (which turned out to be the case) and
> > I wanted to warn others in case someone in the future would be using
> > your settings as a reference :-)
> >
> > Best regards
> > Pavel
> >
> > On Wed, 2021-11-24 at 11:09 +0100, David Holec wrote:
> > > Hi Pavel,
> > >
> > > Many thanks for your insights. As you know, I am not an expert on
> > > how to compile codes; for me, this is sadly a trial-and-error
> > > adventure.
> > >
> > > I tried to compile it against the openblas library, but although
> > > the compilation ends without any errors, I get a segmentation
> > > fault when running lapw1 (on the test case from
> > > http://www.wien2k.at/reg_user/benchmark/). The current settings
> > > are:
> > >
> > >  L   Linker Flags:            $(FOPT) -L/usr/lib/x86_64-linux-gnu/openblas64-openmp
> > >  R   R_LIBS (LAPACK+BLAS):    /usr/lib/x86_64-linux-gnu/openblas64-openmp/libopenblas64.so.0 -lpthread
> > >
> > > (The rest is as I wrote in my first email.)
> > > Here is the list of linked libraries:
> > > $ ldd lapw1
> > >         linux-vdso.so.1 (0x00007ffea57d6000)
> > >         libopenblas64.so.0 => /lib/x86_64-linux-gnu/libopenblas64.so.0 (0x000014fe2b2e5000)
> > >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x000014fe2b2c2000)
> > >         libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 (0x000014fe2affa000)
> > >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x000014fe2aeab000)
> > >         libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 (0x000014fe2ae7f000)
> > >         libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x000014fe2ae3d000)
> > >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x000014fe2ac49000)
> > >         /lib64/ld-linux-x86-64.so.2 (0x000014fe2d4d3000)
> > >         libquadmath.so.0 => /lib/x86_64-linux-gnu/libquadmath.so.0 (0x000014fe2abff000)
> > >         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x000014fe2abe4000)
> > >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x000014fe2abde000)
> > >
> > > And here is the stacking fault (it doesn't tell me anything):
> > > $ x lapw1
> > >
> > > Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
> > >
> > > Backtrace for this error:
> > > #0  0x148fb48b8d21 in ???
> > > #1  0x148fb48b7ef5 in ???
> > > #2  0x148fb452d20f in ???
> > > #3  0x148fb54495fb in ???
> > > Segmentation fault
> > > 0.949u 0.472s 0:00.84 167.8% 0+0k 14400+10992io 31pf+0w
> > >
> > > Any idea?
> > >
> > > With the settings I reported yesterday, everything works just fine
> > > (though probably not very efficiently, but this is not a problem
> > > for me as this binary is not a "production" binary on any HPC):
> > >
> > > $ ldd lapw1
> > >         linux-vdso.so.1 (0x00007ffc60765000)
> > >         libblas.so.3 => /lib/x86_64-linux-gnu/libblas.so.3 (0x0000150cec90f000)
> > >         liblapack.so.3 => /lib/x86_64-linux-gnu/liblapack.so.3 (0x0000150cec26b000)
> > >         libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x0000150cec248000)
> > >         libgfortran.so.5 => /lib/x86_64-linux-gnu/libgfortran.so.5 (0x0000150cebf80000)
> > >         libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x0000150cebe31000)
> > >         libmvec.so.1 => /lib/x86_64-linux-gnu/libmvec.so.1 (0x0000150cebe05000)
> > >         libgomp.so.1 => /lib/x86_64-linux-gnu/libgomp.so.1 (0x0000150cebdc1000)
> > >         libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x0000150cebbcf000)
> > >         libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x0000150cebbb4000)
> > >         /lib64/ld-linux-x86-64.so.2 (0x0000150cec9fc000)
> > >         libquadmath.so.0 => /lib/x86_64-linux-gnu/libquadmath.so.0 (0x0000150cebb6a000)
> > >         libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x0000150cebb64000)
> > >
> > > and:
> > >
> > > $ x lapw1
> > >  STOP  LAPW1 END
> > > 162.045u 0.918s 2:31.25 107.7% 0+0k 8976+37856io 35pf+0w
> > > $ grep HORB *output1*
> > > test_case.output1: TIME HAMILT (CPU)  = 16.3, HNS = 20.0, HORB = 0.0, DIAG = 125.9, SYNC = 0.0
> > > test_case.output1: TIME HAMILT (WALL) =  4.4, HNS = 20.0, HORB = 0.0, DIAG = 126.0, SYNC = 0.0
> > >
> > > (I am using it on my 10-year old "type writer":
> > > $ lscpu
> > > Architecture:        x86_64
> > > CPU op-mode(s):      32-bit, 64-bit
> > > Byte Order:          Little Endian
> > > Vendor ID:           GenuineIntel
> > > CPU family:          6
> > > Model:               26
> > > Model name:          Intel(R) Xeon(R) CPU W3550 @ 3.07GHz
> > > )
> > >
> > > ---
> > > Dr David Holec
> > >
> > > On Wed, 24 Nov 2021 at 08:27, Pavel Ondračka <[email protected]> wrote:
> > > > Hi David,
> > > >
> > > > as you said it works for you, so feel free to ignore, but I have
> > > > some further tips if you are interested. Ubuntu switches between
> > > > the different blas and lapack using the "alternatives", so it's
> > > > difficult to say if you actually link with the correct one.
> > > >
> > > > "ldd lapw1" in WIENROOT should show which one is actually linked;
> > > > what you want to have is the openmp openblas
> > > > /usr/lib/x86_64-linux-gnu/openblas-openmp/libblas.so
> > > > /usr/lib/x86_64-linux-gnu/openblas-openmp/liblapack.so
> > > > or alternatively
> > > > /usr/lib/x86_64-linux-gnu/openblas-openmp/libopenblas.so
> > > > It looks like you linked with the pthread one. This is not a
> > > > problem when running at a single thread, but at higher thread
> > > > counts it might lead to oversubscription and slowdowns, as the
> > > > pthreaded openblas doesn't respect the OMP_NUM_THREADS set by
> > > > Wien2k. So I would recommend relinking with the openmp OpenBLAS.
> > > > BTW it is usually safer to link with OpenBLAS explicitly using
> > > > -lopenblas instead of -llapack -lblas, to be sure you don't
> > > > accidentally link the netlib one (libopenblas is just the libblas
> > > > and liblapack provided by OpenBLAS merged together).
> > > >
> > > > In general, an easy way to check performance is to run the
> > > > serial test_case from http://www.wien2k.at/reg_user/benchmark/
> > > > On modern CPUs (at least avx2) the runtime should be around
> > > > 15-25 seconds at a single thread.
> > > >
> > > > I see a total runtime of ~18 seconds on Fedora 35 with gfortran
> > > > 11.2.1, OpenBLAS and an AMD Ryzen 9 3900X 12-Core Processor.
> > > > Also look for the following line in test_case.output1; this is
> > > > what I have:
> > > > TIME HAMILT (WALL) = 2.2, HNS = 1.7, HORB = 0.0, DIAG = 14.0, SYNC = 0.0
> > > > The time in HAMILT mostly depends on your compiler and
> > > > vectorization settings, while the DIAG is 99% lapack/blas
> > > > related, so this can help with the diagnostics if things are
> > > > slow.
> > > >
> > > > You might also get extra speedup of the HAMILT part by adding
> > > > "-DHAVE_LIBMVEC" to the Compiler options.
> > > >
> > > > Best regards
> > > > Pavel
> > > >
> > > > On Tue, 2021-11-23 at 11:07 +0100, David Holec wrote:
> > > > > Dear all,
> > > > >
> > > > > I have just spent some time making Wien2k run on my single
> > > > > machine running Ubuntu 20.04 with gfortran/gcc. Since I am not
> > > > > an expert, it was trial and error, but it seems that I found a
> > > > > working combination (sadly, the default parameters didn't work
> > > > > for me). Maybe this will help someone.
> > > > > Here are the settings that did the job for me:
> > > > >
> > > > >  M   OpenMP switch:           -fopenmp
> > > > >  O   Compiler options:        -ffree-form -O2 -ftree-vectorize -march=native -ffree-line-length-none -ffpe-summary=none
> > > > >  L   Linker Flags:            $(FOPT) -L/usr/lib/x86_64-linux-gnu
> > > > >  P   Preprocessor flags       '-DParallel'
> > > > >  R   R_LIBS (LAPACK+BLAS):    -lblas -llapack -lpthread
> > > > >  F   FFTW options:            -DFFTW3 -DFFTW_OMP -I/usr/include
> > > > >      FFTW-LIBS:               -L/usr/lib/x86_64-linux-gnu -lfftw3 -lfftw3_omp
> > > > >
> > > > > where the FFTW options were:
> > > > >
> > > > >  R   FFTWROOT:                /usr/
> > > > >  V   FFTW_VERSION:            FFTW3
> > > > >  L   FFTW_LIB:                lib/x86_64-linux-gnu
> > > > >  N   FFTW_LIBNAME:            fftw3
> > > > >
> > > > > Compiler versions:
> > > > > $ gcc --version
> > > > > gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > > > $ gfortran --version
> > > > > GNU Fortran (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
> > > > >
> > > > > And I used the generic lapack and openblas packages provided by
> > > > > the Ubuntu repos:
> > > > > liblapack-dev/focal,now 3.9.0-1build1 amd64 [installed]
> > > > > liblapack3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> > > > >
> > > > > liblapack64-3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> > > > > liblapack64-dev/focal,now 3.9.0-1build1 amd64 [installed]
> > > > >
> > > > > libblas-dev/focal,now 3.9.0-1build1 amd64 [installed]
> > > > > libblas3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> > > > > libblas64-3/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> > > > > libblas64-dev/focal,now 3.9.0-1build1 amd64 [installed,automatic]
> > > > >
> > > > > libopenblas64-0/focal-updates,now 0.3.8+ds-1ubuntu0.20.04.1 amd64 [installed]
> > > > > libopenblas64-0-openmp/focal-updates,now 0.3.8+ds-1ubuntu0.20.04.1 amd64 [installed]
> > > > > libopenblas64-0-pthread/focal-updates,now 0.3.8+ds-1ubuntu0.20.04.1 amd64 [installed,automatic]
> > > > >
> > > > > (I am not totally sure if I need all the libraries above, but
> > > > > certainly, with these, the compilation seems to work and I am
> > > > > able to run SCF cycles & Telnes calculations without errors :-)
> > > > >
> > > > > All the best,
> > > > > David
_______________________________________________
Wien mailing list
[email protected]
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/[email protected]/index.html
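
A note on the .machines rule from the top message (k-points run in parallel times threads per lapw1/lapw2 job should not exceed the number of cores): the two example files can be generated with a small shell function. This is only a sketch for a single host called localhost. The name make_machines and its two arguments are made up for illustration and are not part of Wien2k, and rejecting thread counts that do not divide the core count evenly is a simplification of the sketch, stricter than the "must not exceed" rule.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Wien2k): print a .machines file
# for one multi-core host.
# Usage: make_machines CORES THREADS_PER_KPOINT_JOB
make_machines() {
    cores=$1
    threads=$2
    # Simplification: only accept thread counts that divide the cores,
    # so that (parallel k-point jobs) x (threads) == cores exactly.
    if [ "$threads" -lt 1 ] || [ "$cores" -lt "$threads" ] \
        || [ $((cores % threads)) -ne 0 ]; then
        echo "threads must be >= 1 and divide the core count" >&2
        return 1
    fi
    njobs=$((cores / threads))
    # One "1:localhost" line per k-point job run in parallel.
    i=0
    while [ "$i" -lt "$njobs" ]; do
        echo "1:localhost"
        i=$((i + 1))
    done
    # OpenMP threads: all cores for the other programs, the chosen
    # count for each lapw1/lapw2 job.
    echo "omp_global:$cores"
    echo "omp_lapw1:$threads"
    echo "omp_lapw2:$threads"
}
```

For the 4-core machine discussed here, "make_machines 4 1 > .machines" gives the first layout (four k-points in parallel, one thread each) and "make_machines 4 2 > .machines" the second (two k-points, two threads each).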

