From the top output sent earlier, it looks like the administrators might
have configured the system with no swap:
r1i1n2
Swap: 0M total, 0M used, 0M free, 10563M cached
r1i1n3
Swap: 0M total, 0M used, 0M free, 23089M cached
Keep in mind that having swap might mean the difference between degraded
performance and a hard crash under low memory
[http://unix.stackexchange.com/questions/190398/do-i-need-swap-space-if-i-have-more-than-enough-amount-of-ram].
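If it helps to check this on each node without reading top by eye, here is a
minimal Python sketch that reads the same numbers from /proc/meminfo. The
function names (parse_meminfo, swap_summary, read_meminfo) are my own, not
part of WIEN2k or any library, and the "Cached" field of /proc/meminfo only
roughly corresponds to the "cached" column that top prints.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values.

    Most entries are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def swap_summary(info):
    """Return (swap_total_mb, swap_used_mb, cached_mb) from parsed meminfo."""
    total_kb = info.get("SwapTotal", 0)
    free_kb = info.get("SwapFree", 0)
    cached_kb = info.get("Cached", 0)
    return total_kb // 1024, (total_kb - free_kb) // 1024, cached_kb // 1024


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the kernel's memory report (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())
```

On a compute node, print(swap_summary(read_meminfo())) would give the three
values in MB; a swap total of 0 would confirm that no swap is configured.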
On 9/29/2015 5:57 AM, Laurence Marks wrote:
If it happens again, one thing to ask them to check is swap usage and
how much memory is cached. On some of my nodes I have noticed that
they do not always release cached memory, and can start swapping. If
this happens the job will get very slow. The commands to use to clear
the cache can be found at
http://www.tecmint.com/clear-ram-memory-cache-buffer-and-swap-space-on-linux/
or similar. (Needs root access.) Top can also show memory use.
While there should be no need to do this, I have noticed that I need
to do it every 3 hours on 4 nodes - the other 20 don't need it. It is an
issue mainly for big calculations.
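For reference, the cache clearing described above usually comes down to
flushing dirty pages and then writing 1, 2, or 3 to /proc/sys/vm/drop_caches
(1 = page cache, 2 = dentries and inodes, 3 = both). A minimal Python sketch,
assuming a Linux node and root access; the function name drop_caches and its
path parameter are illustrative only, and this is not necessarily the exact
procedure from the linked page:

```python
import os

# Standard Linux kernel interface for dropping clean caches (needs root).
DROP_CACHES_PATH = "/proc/sys/vm/drop_caches"


def drop_caches(mode=3, path=DROP_CACHES_PATH):
    """Flush dirty pages to disk, then ask the kernel to drop clean caches.

    mode: 1 = page cache, 2 = dentries and inodes, 3 = both.
    path: overridable so the function can be tried against an ordinary file.
    """
    if mode not in (1, 2, 3):
        raise ValueError("mode must be 1, 2, or 3")
    os.sync()  # write dirty pages out first so more of the cache can be freed
    with open(path, "w") as handle:
        handle.write(f"{mode}\n")
```

Calling drop_caches() as root drops the page cache plus dentries and inodes.
It only releases clean cached data, so it does not make up for missing swap.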
Alternatively, it was something else: a zombie process, big log files, or
other things. Rebooting gets rid of a lot of system caches and helps -- even
on my Android tablet every week or two. It's murky waters.
---
Professor Laurence Marks
Department of Materials Science and Engineering
Northwestern University
http://www.numis.northwestern.edu
Corrosion in 4D: http://MURI4D.numis.northwestern.edu
Co-Editor, Acta Cryst A
"Research is to see what everybody else has seen, and to think what
nobody else has thought"
Albert Szent-Györgyi
_______________________________________________
Wien mailing list
Wien@zeus.theochem.tuwien.ac.at
http://zeus.theochem.tuwien.ac.at/mailman/listinfo/wien
SEARCH the MAILING-LIST at:
http://www.mail-archive.com/wien@zeus.theochem.tuwien.ac.at/index.html