I don't have the answer, but in future you may want to consider splitting the calculation into a set of shorter runs and saving the interim results, something like:

    for i in 1 2 3 4 ... XX
    do
        # keep a safety copy of the interim results (-p: no error if it exists)
        mkdir -p Safety
        runsp_lapw -hf ... -i 3 -NI
        # replace the old *bro* files in Safety with the current ones
        rm -f Safety/*bro*
        mv *bro* Safety
        save_lapw -f -d Safety
        # bring the *bro* and scf files back for the next short run
        cp Safety/*bro* ./ ; cp Safety/*.scf ./
    done

(It would be easier if save_lapw had an option to not delete the *bro* files and to retain case.scf -- a simple hack.)

On Thu, May 18, 2017 at 12:27 PM, Luis Ogando <lcoda...@gmail.com> wrote:
> Dear Gavin,
>
> Thank you very much for your answer.
> I am using Wien2k 14.2 and, unfortunately, that was the only message I
> got from the standard output file (queuing system). The error files and
> case.dayfile have no useful information.
> The interruption was during the hf execution, after lapw1, which
> finished without a problem.
> It was not the first time I had to restart the calculation due to a
> shutdown. In the other cases, I restarted the calculation from scratch, but,
> with a non-parallel calculation, I have to solve this reinitialization issue
> or the calculation will never end. So, I would be glad if someone else could
> give me another hint.
> Thank you again.
> All the best,
> Luis
>
>
> 2017-05-18 11:35 GMT-03:00 Gavin Abo <gs...@crimson.ua.edu>:
>>
>> Sorry, those code line numbers are for WIEN2k 16.1. For example, if you
>> are using WIEN2k 14.2, the line numbers should be 998 instead of 1354 and
>> 1006 instead of 1365 in SRC_hf/calc_h.F.
>>
>>
>> On 5/18/2017 8:19 AM, Gavin Abo wrote:
>>
>> Unfortunately, I think that error message can tell you "why" the
>> calculation stopped, but it might not tell you the initial "cause" of it.
>> That is likely because the issue that caused it happened earlier in the
>> calculation (perhaps lapw1?). The vector file size is smaller than the
>> vectorhf_old. I'm not sure if they should be the same size or not.
>> If so, perhaps you need to restart the calculation in the lapw1 step
>> (-s lapw1) to regenerate the vector file instead of starting with the hf
>> step (-s hf), which I believe comes later in the calculation than lapw1,
>> or you might just have to start the calculation over from scratch.
>>
>> In SRC_hf/calc_h_2.F, you should see:
>>
>> line 1354:
>> !_COMPLEX   call zheev('V','U',nbf,ham,nbf,enknew,workdiag,2*nbf-1,rworkdiag,info)
>>
>> line 1365:
>>       if (info .ne. 0) then
>>          print *, 'info=', info
>>          stop 'error in calc_h_2: info not equal to 0'
>>       endif
>>
>> From the code above, you can see that there likely should be a little more
>> error information available from the "print *, 'info=', info" statement that
>> you did not report. I believe this should have been printed to the standard
>> output (terminal or std output file if you are using a queuing system).
>>
>> Depending on the value of the info variable, the calculation seems to have
>> stopped because it encountered an illegal value or there was a convergence
>> problem [1]:
>>
>> INFO is INTEGER
>>   = 0:  successful exit
>>   < 0:  if INFO = -i, the i-th argument had an illegal value
>>   > 0:  if INFO = i, the algorithm failed to converge; i
>>         off-diagonal elements of an intermediate tridiagonal
>>         form did not converge to zero.
>>
>> Perhaps the software developers of the hf code have further insight than
>> I currently do into what could resolve the problem.
>>
>> [1]
>> http://www.netlib.org/lapack/explore-html/df/d9a/group__complex16_h_eeigen_ga70c041fd19635ff621cfd5d804bd7a30.html#ga70c041fd19635ff621cfd5d804bd7a30
>>
>> On 5/18/2017 5:52 AM, Luis Ogando wrote:
>>
>> I do not know if it is relevant, but my calculation is complex (-c).
>> Thank you again,
>> Luis
>>
>>
>> 2017-05-18 8:29 GMT-03:00 Luis Ogando <lcoda...@gmail.com>:
>>>
>>> Dear Wien2k community,
>>>
>>> I am trying to calculate the dielectric function for wurtzite GaP
>>> using -hf and -so as previously discussed (
>>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg14603.html
>>> ).
>>> There was a shutdown of the machine during the hf execution in the
>>> first step of the calculation ( run_lapw -hf ... ). When the machine came
>>> back, I removed the case.vectorhf (case.vectorhf_old is still there) and
>>> case.energyhf. Then, I executed
>>>
>>> run_lapw -hf -NI -s hf -ec 0.0001 -cc 0.0001 -i 200
>>>
>>> trying to restart the calculation (non-parallel execution due to the HF x
>>> SO issue discussed in the previous messages above).
>>> The calculation restarted without a problem, but when the
>>> case.vectorhf reached 187MB (less than half of the expected size, see
>>> below) I got an error.
>>>
>>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 03:51 GaPwurtHSE-DielSO-1.vector
>>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 00:14 GaPwurtHSE-DielSO-1.vectorhf
>>> -rw-r--r-- 1 luisoda luisoda 565M Abr 23 21:33 GaPwurtHSE-DielSO-1.vectorhf_old
>>>
>>> The only related error message I found was:
>>>
>>> error in calc_h: info not equal to 0
>>>
>>> I am probably making a mistake when restarting the calculation and I
>>> would really appreciate any help with this issue.
>>> Many thanks in advance.
>>> All the best,
>>> Luis
>>
>>
>> _______________________________________________
>> Wien mailing list
>> Wien@zeus.theochem.tuwien.ac.at
>> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
>> SEARCH the MAILING-LIST at:
>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
>>
>

--
Professor Laurence Marks
"Research is to see what everybody else has seen, and to think what nobody else has thought", Albert Szent-Gyorgi
www.numis.northwestern.edu ; Corrosion in 4D: MURI4D.numis.northwestern.edu
Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
Co-Editor, Acta Cryst A
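
As a rough illustration of the INFO convention quoted above from the zheev documentation, the number printed after "info=" in the standard output could be decoded with a small shell helper along these lines. The function name describe_info is made up for this sketch and is not part of WIEN2k or LAPACK; only the three cases (zero, negative, positive) come from the quoted documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of WIEN2k or LAPACK): translate the integer
# that zheev returns in INFO into the meaning given in the LAPACK docs.
describe_info() {
    info=$1
    if [ "$info" -eq 0 ]; then
        echo "successful exit"
    elif [ "$info" -lt 0 ]; then
        # INFO = -i: the i-th argument had an illegal value
        echo "argument $(( 0 - $info )) had an illegal value"
    else
        # INFO = i: i off-diagonal elements did not converge to zero
        echo "failed to converge: $info off-diagonal elements did not converge to zero"
    fi
}

# Example calls covering the three documented cases
describe_info 0
describe_info -3
describe_info 12
```

A negative value would point at the arguments passed to zheev (for instance a damaged or truncated input), while a positive value would point at the diagonalization itself, so knowing the sign already narrows down where to look.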