Dear Peter and Michael,

I get the segmentation fault with both OneAPI 2024.2 and OneAPI 2025.1; it appears already with -O1.
As I mentioned some time ago: when I comment out the $omp directives at lines 1649 ff., the program runs smoothly.
It seems that this is an old unresolved problem, as it is mentioned in a comment by jdoumont (30/7/20); however, it does not seem to depend on the size of the calculation.

Ciao
Gerhard

DEEP THOUGHT in D. Adams; Hitchhikers Guide to the Galaxy:
"I think the problem, to be quite honest with you,
is that you have never actually known what the question is."

====================================
Dr. Gerhard H. Fecher
Institut of Physics
Johannes Gutenberg - University
55099 Mainz
________________________________________
From: Wien [wien-boun...@zeus.theochem.tuwien.ac.at] on behalf of Peter Blaha [peter.bl...@tuwien.ac.at]
Sent: Saturday, 7 June 2025 20:40
To: wien@zeus.theochem.tuwien.ac.at
Subject: Re: [Wien] New findings on the lapw0 seg fault core dump error

Very curious.

Is the "number of PW" in case.clmsum identical after init_lapw and after the first cycle?

Since this is a small case: can you manually look at the Fourier coefficients in clmsum? Are there any "huge" numbers? Any *** numbers?

After dstart, I guess none of the FK are zero. After mixer (after the 1st iteration) the later ones should be zero.

My guess is a problem in the libthread library of your compiler version (ifx 2025.xxx ?). The problems did not show up with previous compilers?

On 07.06.2025 at 18:18, Michael Fechtelkord via Wien wrote:
> smiles .. no, it is MgF2. Just two atoms in a cubic cell, and it is not
> dependent on the structure. It crashes for all of them in the first cycle
> using the clmsum from init_lapw.
>
> On 07.06.2025 at 17:34, Peter Blaha wrote:
>> Is this a big supercell ?
>>
>> The only thing I could imagine is that the number of PWs is bigger
>> after dstart than after the 1st cycle.
>> grep for "PW" in the clmsum files from dstart and after the 1st cycle.
>> If necessary, reduce the number of PWs until it works, as a temporary fix.
>> It might be a "stack" problem and I think one can increase this
>> somehow, but I can't remember how.
>>
>> On 06.06.2025 at 22:25, Michael Fechtelkord via Wien wrote:
>>> and an additional comment.
>>>
>>> lapw0 crashes only in the first cycle with OMP_NUM_THREADS higher
>>> than 1. When I set lapw0:1 for the first cycle (using -i 1 in
>>> run_lapw) and then after the first run set it back to lapw0:8, it runs
>>> without a problem for the complete scf cycle. It seems that it is a
>>> problem with the initial case.clmsum file (init_lapw -b -prec 1).
>>>
>>> On 06.06.2025 at 22:07, Michael Fechtelkord via Wien wrote:
>>>> Hello Peter,
>>>>
>>>> omp_lapw0 in .machines was 8. I reduced it from 8 to 4, then to 2
>>>> and finally to 1. Only in the case of omp_lapw0:1 does lapw0 not crash.
>>>>
>>>> omp_global:2
>>>>
>>>> Best regards,
>>>>
>>>> Michael
>>>>
>>>> On 06.06.2025 at 17:59, Peter Blaha wrote:
>>>>> What was your OMP_NUM_THREADS variable ?
>>>>>
>>>>> Set it to 1, 2, ... and check if the error occurs again.
>>>>>
>>>>> On 06.06.2025 at 14:07, Michael Fechtelkord via Wien wrote:
>>>>>> I debugged the core-dump file with gdb, using debugging symbols
>>>>>> in the compilation of lapw0.
>>>>>>
>>>>>> The debugger gave me the line which causes the core dump:
>>>>>>
>>>>>> ----------------------------------------
>>>>>>
>>>>>> Debuginfod has been enabled.
>>>>>> To make this setting permanent, add 'set debuginfod enabled on'
>>>>>> to .gdbinit.
>>>>>> [Thread debugging using libthread_db enabled]
>>>>>> Using host libthread_db library "/lib64/libthread_db.so.1".
>>>>>> Core was generated by `/usr/local/WIEN2k/lapw0 lapw0.def'.
>>>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>>>>
>>>>>> #0 0x000000000048b89b in
>>>>>> MAIN__.DIR.OMP.PARALLEL.LOOP.12.split63842.split63939 () at
>>>>>> lapw0.F:1649
>>>>>>
>>>>>> 1649 !$omp parallel do reduction(+:rhopw00,cwk,cvout) &
>>>>>>
>>>>>> [Current thread is 1 (Thread 0x14823edbe740 (LWP 339344))]
>>>>>>
>>>>>> ------------------------------------
>>>>>>
>>>>>> Maybe somebody has an idea how to fix it.
>>>>>>
>>>>>> Best regards
>>>>>>
>>>>>> Michael
>>>>>>
>>>>>> On 17.05.2025 at 13:48, Michael Fechtelkord via Wien wrote:
>>>>>>> Hello everybody,
>>>>>>>
>>>>>>> I have new results concerning the lapw0 crash, which happens
>>>>>>> occasionally (segmentation fault error - core dump).
>>>>>>>
>>>>>>> It seems that the crucial thing is the case.clmsum file (I am no
>>>>>>> expert here), so this may somehow be the key: it can produce the
>>>>>>> lapw0 crash, so it might be that it sometimes triggers it.
>>>>>>>
>>>>>>> I calculated MgF2 and replaced the newly generated clmsum with an
>>>>>>> older one, and then there was no crash. I cannot attach them
>>>>>>> because the file size is too large.
>>>>>>>
>>>>>>> I am not familiar enough with debugging to find out why and where
>>>>>>> it happens.
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>> Michael
>>>>>>>
>>>>>> --
>>>>>> Dr. Michael Fechtelkord
>>>>>>
>>>>>> Institut für Geologie, Mineralogie und Geophysik
>>>>>> Ruhr-Universität Bochum
>>>>>> Universitätsstr. 150
>>>>>> D-44780 Bochum
>>>>>>
>>>>>> Phone: +49 (234) 32-24380
>>>>>> Fax: +49 (234) 32-04380
>>>>>> Email: michael.fechtelk...@ruhr-uni-bochum.de
>>>>>> Web Page: https://www.ruhr-uni-bochum.de/kristallographie/kc/mitarbeiter/fechtelkord/
>>>>>>
>>>>>> _______________________________________________
>>>>>> Wien mailing list
>>>>>> Wien@zeus.theochem.tuwien.ac.at
>>>>>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>>>>>> SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>>>>
>>
--
-----------------------------------------------------------------------
Peter Blaha, Inst. f. Materials Chemistry, TU Vienna, A-1060 Vienna
Phone: +43-158801165300
Email: peter.bl...@tuwien.ac.at
WWW: http://www.imc.tuwien.ac.at WIEN2k: http://www.wien2k.at
-------------------------------------------------------------------------
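
On the possible "stack" problem raised in the thread: the following is only a minimal sketch of how one might test that hypothesis from a bash shell before rerunning lapw0. OMP_STACKSIZE is the standard OpenMP environment variable for the stack of the worker threads, and ulimit -s sets the stack limit of the main thread. The value 512M and the two clmsum file names are placeholders chosen for illustration, not WIEN2k recommendations, and whether a larger stack actually avoids the crash at lapw0.F:1649 is not established here.

```shell
#!/bin/bash
# Assumption: 512M is an arbitrary trial value, not a tested setting.
export OMP_STACKSIZE=512M

# Raise the main-thread stack limit; fall back to the hard limit
# if "unlimited" is not permitted on this system.
ulimit -s unlimited 2>/dev/null || ulimit -s "$(ulimit -H -s)"

# Compare the "PW" line in two saved clmsum files (placeholder names),
# e.g. one copied after init_lapw and one copied after the first cycle.
for f in case.clmsum_init case.clmsum_cycle1; do
    if [ -f "$f" ]; then
        grep -H "PW" "$f" || true
    fi
done
```

Both settings are inherited by child processes, so they would need to be in effect in the shell (or job script) that starts run_lapw.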