Dear Prof. Marks,

Thank you very much for your help!
However, I would still like to understand why the -s option, which is designed to restart a calculation at the point where it crashed, does not work. Without that, I am afraid that even your suggestion will not help.

Thank you again,
Luis

2017-05-18 14:39 GMT-03:00 Laurence Marks <l-ma...@northwestern.edu>:

> I don't have the answer, but you may want to contemplate in the future
> doing something like a set of shorter runs saving the interim results
>
> for i in 1 2 3 4 ... XX
> do
>   mkdir Safety
>   runsp_lapw -hf ... -i 3 -NI
>   rm Safety/*bro*
>   mv *bro* Safety
>   save -f -d Safety
>   cp Safety/*bro* ./ ; cp Safety/*.scf ./
> done
>
> (It would be easier if save_lapw had an option to not delete the *bro*
> files and retain case.scf -- a simple hack.)
>
> On Thu, May 18, 2017 at 12:27 PM, Luis Ogando <lcoda...@gmail.com> wrote:
> > Dear Gavin,
> >
> > Thank you very much for your answer.
> > I am using Wien2k 14.2 and, unfortunately, that was the only message I
> > got from the standard output file (queuing system). The error files and
> > case.dayfile have no useful information.
> > The interruption was during the hf execution, after lapw1, that
> > finished without a problem.
> > It was not the first time I had to restart the calculation due to a shut
> > down. In the other cases, I restarted the calculation from scratch, but,
> > with a non parallel calculation, I have to solve this reinitialization issue
> > or the calculation will never end. So, I would be glad if someone else could
> > give me another hint.
> > Thank you again.
> > All the best,
> > Luis
> >
> > 2017-05-18 11:35 GMT-03:00 Gavin Abo <gs...@crimson.ua.edu>:
> >>
> >> Sorry, those code line numbers are for WIEN2k 16.1. For example, if you
> >> are using WIEN2k 14.2, the line numbers should be 998 instead of 1354 and
> >> 1006 instead of 1365 in SRC_hf/calc_h.F.
> >>
> >>
> >> On 5/18/2017 8:19 AM, Gavin Abo wrote:
> >>
> >> Unfortunately, I think that error message can tell you "why" the
> >> calculation stopped, but it might not tell you the initial "cause" of it.
> >> That is likely because the issue that caused it happened earlier in the
> >> calculation (perhaps lapw1?).
> >> The vector file size is smaller than the
> >> vectorhf_old. I'm not sure if they should be the same size or not. If so,
> >> perhaps you need to restart the calculation in the lapw1 step (-s lapw1) to
> >> regenerate the vector file instead of starting with the hf step (-s hf),
> >> which I believe comes later in the calculation from that of lapw1, or you
> >> might just have to start the calculation over from scratch.
> >>
> >> In SRC_hf/calc_h_2.F, you should see:
> >>
> >> line 1354:
> >> !_COMPLEX      call zheev('V','U',nbf,ham,nbf,enknew,workdiag,2*nbf-1,rworkdiag,info)
> >>
> >> line 1365:
> >>       if (info .ne. 0) then
> >>         print *, 'info=', info
> >>         stop 'error in calc_h_2: info not equal to 0'
> >>       endif
> >>
> >> From the code above, you can see that there likely should be a little more
> >> error information available from the "print *, 'info=', info" statement that
> >> you did not report. I believe this should have been printed to the standard
> >> output (terminal or std output file if you are using a queuing system).
> >>
> >> Depending on the value of the info variable, the calculation seems to have
> >> stopped because it encountered an illegal value or there was a convergence
> >> problem [1]:
> >>
> >> INFO is INTEGER
> >>   = 0: successful exit
> >>   < 0: if INFO = -i, the i-th argument had an illegal value
> >>   > 0: if INFO = i, the algorithm failed to converge; i
> >>        off-diagonal elements of an intermediate tridiagonal
> >>        form did not converge to zero.
> >>
> >> Perhaps, the software developers of the hf code have further insight than
> >> I currently do into what could resolve the problem.
> >>
> >> [1]
> >> http://www.netlib.org/lapack/explore-html/df/d9a/group__complex16_h_eeigen_ga70c041fd19635ff621cfd5d804bd7a30.html#ga70c041fd19635ff621cfd5d804bd7a30
> >>
> >> On 5/18/2017 5:52 AM, Luis Ogando wrote:
> >>
> >> I do not know if it is relevant, but my calculation is complex (-c).
> >> Thank you again,
> >> Luis
> >>
> >>
> >> 2017-05-18 8:29 GMT-03:00 Luis Ogando <lcoda...@gmail.com>:
> >>>
> >>> Dear Wien2k community,
> >>>
> >>> I am trying to calculate the dielectric function for wurtzite GaP
> >>> using -hf and -so as previously discussed (
> >>> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/msg14603.html
> >>> ).
> >>> There was a shut down of the machine during the hf execution in the
> >>> first step of the calculation ( run_lapw -hf ... ). When the machine came
> >>> back, I removed the case.vectorhf (case.vectorhf_old is still there) and
> >>> case.energyhf. Then, I executed
> >>>
> >>> run_lapw -hf -NI -s hf -ec 0.0001 -cc 0.0001 -i 200
> >>>
> >>> trying to restart the calculation (non-parallel execution due to the HF x
> >>> SO issue discussed in the previous messages above).
> >>> The calculation restarted without a problem, but when the
> >>> case.vectorhf reached 187MB (less than a half of the expected size, see
> >>> below) I got an error.
> >>>
> >>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 03:51 GaPwurtHSE-DielSO-1.vector
> >>> -rw-r--r-- 1 luisoda luisoda 187M Mai 18 00:14 GaPwurtHSE-DielSO-1.vectorhf
> >>> -rw-r--r-- 1 luisoda luisoda 565M Abr 23 21:33 GaPwurtHSE-DielSO-1.vectorhf_old
> >>>
> >>> The only related error message I found was:
> >>>
> >>> error in calc_h: info not equal to 0
> >>>
> >>> I am probably making a mistake when restarting the calculation and I
> >>> would really appreciate any help with this issue.
> >>> Many thanks in advance.
> >>> All the best,
> >>> Luis
> >>
> >>
> >> _______________________________________________
> >> Wien mailing list
> >> Wien@zeus.theochem.tuwien.ac.at
> >> http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
> >> SEARCH the MAILING-LIST at:
> >> http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html
> >>
> >
>
> --
> Professor Laurence Marks
> "Research is to see what everybody else has seen, and to think what
> nobody else has thought", Albert Szent-Gyorgi
> www.numis.northwestern.edu ; Corrosion in 4D: MURI4D.numis.northwestern.edu
> Partner of the CFW 100% program for gender equity, www.cfw.org/100-percent
> Co-Editor, Acta Cryst A
>
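
For anyone following the thread: the three INFO cases quoted above from the LAPACK zheev documentation can be written down as a small shell helper for reading an "info=" value, should one turn up in the standard output. This is only a sketch under assumptions. The function name explain_zheev_info is made up for illustration and is not part of WIEN2k or LAPACK, and the messages simply paraphrase the quoted documentation.

```shell
#!/bin/sh
# Hypothetical helper (not part of WIEN2k or LAPACK): explain the INFO
# value returned by zheev, following the three cases quoted in the thread.
explain_zheev_info() {
    info=$1
    # Reject anything that is not an optionally negative integer.
    case "$info" in
        ''|-|*[!0-9-]*|*?-*)
            echo "not an integer: $info"
            return 1
            ;;
    esac
    if [ "$info" -eq 0 ]; then
        echo "successful exit"
    elif [ "$info" -lt 0 ]; then
        # INFO = -i: the i-th argument of the call had an illegal value.
        echo "argument $(( 0 - info )) had an illegal value"
    else
        # INFO = i: i off-diagonal elements did not converge to zero.
        echo "failed to converge: $info off-diagonal elements did not converge to zero"
    fi
}
```

As a usage example, explain_zheev_info -8 would point at the eighth argument of the call, which in the zheev line quoted above is the work-array length passed as 2*nbf-1. A positive value would instead indicate a convergence failure in the diagonalization itself.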