29.09.2015 14:57, Laurence Marks wrote:
If it happens again, one thing to ask them to check is swap usage and how much memory is cached.
...
Alternatively, it was something else: a zombie, big log files or other things. Rebooting gets rid of a lot of system caches and helps.
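For reference, the swap and cache check suggested above could be scripted roughly as follows. This is only a minimal sketch, assuming a Linux node where /proc/meminfo is readable; the function names parse_meminfo and swap_and_cache_kb are made up for illustration and are not part of WIEN2k or any system tool.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (mostly kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_and_cache_kb(info):
    """Return used swap, page cache and free memory, all in kB."""
    return {
        "swap_used_kB": info.get("SwapTotal", 0) - info.get("SwapFree", 0),
        "cached_kB": info.get("Cached", 0),
        "mem_free_kB": info.get("MemFree", 0),
    }


# Hypothetical usage: only runs where /proc/meminfo exists (Linux).
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(swap_and_cache_kb(parse_meminfo(f.read())))
```

A large swap_used_kB together with a small mem_free_kB would point towards the swapping explanation; the numbers themselves say nothing about which process is responsible.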
I still suspect that parallelization was lost on that node for some unclear reason (maybe this bad swapping/caching threw the parallel options out of memory, and all jobs were sent to only one processor of the node).
I would like to know what the administrator saw in the "1" mode (per-CPU view) of the top command.
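If top is not at hand, roughly the same per-processor picture can be obtained from two snapshots of /proc/stat. The sketch below is an assumption-laden illustration, not what top does internally: it assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal), and the names parse_cpu_times and busy_percent are hypothetical.

```python
import os
import time


def parse_cpu_times(text):
    """Map 'cpuN' -> (total_ticks, idle_ticks) from /proc/stat-style text."""
    times = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        name = fields[0]
        # Keep per-CPU lines (cpu0, cpu1, ...), skip the aggregate 'cpu' line.
        if name.startswith("cpu") and name[3:].isdigit():
            values = [int(v) for v in fields[1:9]]  # user .. steal
            idle = values[3] + (values[4] if len(values) > 4 else 0)
            times[name] = (sum(values), idle)
    return times


def busy_percent(before, after):
    """Percentage of non-idle time per CPU between two snapshots."""
    result = {}
    for name, (total1, idle1) in after.items():
        if name not in before:
            continue
        total0, idle0 = before[name]
        dt = total1 - total0
        result[name] = 0.0 if dt <= 0 else 100.0 * (dt - (idle1 - idle0)) / dt
    return result


# Hypothetical usage: sample one second of activity on a Linux node.
if __name__ == "__main__" and os.path.exists("/proc/stat"):
    with open("/proc/stat") as f:
        first = parse_cpu_times(f.read())
    time.sleep(1.0)
    with open("/proc/stat") as f:
        second = parse_cpu_times(f.read())
    for cpu, pct in sorted(busy_percent(first, second).items()):
        print(cpu, round(pct, 1))
```

One processor near 100% with the others near 0% while several parallel jobs are supposed to be running would be consistent with all jobs landing on a single processor.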
Best wishes
Lyudmila Dobysheva
------------------------------------------------------------------
Phys.-Techn. Institute of Ural Br. of Russian Ac. of Sci.
426001 Izhevsk, ul.Kirova 132
RUSSIA
------------------------------------------------------------------
Tel.:7(3412) 432045(office), 722529(Fax)
E-mail: l...@ftiudm.ru, lyuk...@mail.ru (office)
        lyuk...@gmail.com (home)
Skype: lyuka17 (home), lyuka18 (office)
http://ftiudm.ru/content/view/25/103/lang,english/
------------------------------------------------------------------
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at: http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html