Sounds like a nasty problem …

In terms of strategy, I think the first thing should be to find out if the node is really to blame. If so, you have to convince the admins and/or find a way to avoid it. If not, you can turn to figuring out whatever else (presumably in your Wien2k setup) is causing the trouble.

On 09/24/2015 07:37 PM, Luis Ogando wrote:
> First of all, I wonder: To what extent is this problem
> reproducible? E.g., does your job always run on the same 4 nodes?
>
> Yes.
>
> Is it always the same node(s) that are slow?
>
> Yes

It seems unusual that your job should always be assigned the same nodes, but okay. If you get your job to run on a different set it could help establish if the node is really to blame. In some queuing systems, you can request specific nodes. Or you could submit two copies of your job.

> The strangest part: at the beginning of this month, the same
> calculation was running properly. I had a crash for convergence
> problems and when I reduced the "mixing factor" in case.inm (it is
> now 0.04 in pre-convergence scf cycle) the problems started.
> Obviously, I do not believe that the mixing factor is the problem.
>
> No. All the executables are running slowly in the problematic
> node.

I would try to widen the tests then -- restart the calculation from scratch, try a different case, try other programs …

> Users can do nothing. The administrator sent me the "top's" and I
> have asked him for simultaneous ones.

Like I said, even if you have no direct access you can put it in a job script.
Something along these lines (in bash):

    run &                # "run" stands for the actual job command
    pid=$!               # PID of the background job
    while kill -0 "$pid" 2>/dev/null; do   # loop while the job is alive
        for n in $NODES; do
            ssh "$n" top -bn1 >>"$n.top"
            # plus whatever else you want to check
        done
        sleep 60         # pause between samples; adjust as needed
    done
    wait

Elias
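
P.S. Once the per-node .top files have been collected, something like the following might help to compare the nodes at a glance. It is only a sketch: the function name summarise_top is made up for illustration, and it assumes the header line that procps top prints in batch mode (the one containing "load average: a, b, c") with "." as the decimal separator. Other top implementations or locales may need a different pattern.

```bash
#!/usr/bin/env bash
# Summarise the 1-minute load average recorded in each <node>.top file.
# Assumed header format (procps top, batch mode), e.g.
#   top - 15:04:05 up 12 days,  3:02,  2 users,  load average: 0.15, 0.10, 0.05
# summarise_top is a hypothetical helper, not part of any existing tool.
summarise_top() {
    local f
    for f in "$@"; do
        awk -v node="$(basename "$f" .top)" '
            /load average:/ {
                sub(/.*load average: */, "")   # keep only "a, b, c"
                split($0, la, /, */)           # la[1] is the 1-minute value
                v = la[1] + 0
                sum += v; n++
                if (n == 1 || v > max) max = v
            }
            END {
                if (n) printf "%s samples=%d mean=%.2f max=%.2f\n", node, n, sum / n, max
                else   printf "%s samples=0\n", node
            }' "$f"
    done
}

# Hypothetical usage, run in the directory holding the collected files:
#   summarise_top *.top
```

A node whose mean or maximum load is well above the number of processes your job runs there would suggest that something else is competing for the CPUs, which is the kind of evidence the admins are likely to ask for.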