Hello,

statfs currently serves us as a bare-minimum resource monitor: it can display the status of the nodes, the number of jobs they are running, and so on. Most schedulers need to rely on some kind of resource manager to make their scheduling decisions.
What would be the best way to monitor the status of the nodes and get notified on a state change? statfs already reads the 'state' file on all the nodes periodically. It could export a file, 'monitor', that a scheduler would read. A read on 'monitor' would block until the status of any node changes, and would then return all the nodes that are down at that instant.

Suggestions?

Thanks,
-- Abhishek
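
To make the proposed semantics concrete, here is a minimal in-memory sketch of how a blocking 'monitor' read might behave. It is not statfs code: the class name Monitor, the "up"/"down" state strings, and the version counter are all assumptions made for illustration. The version counter in particular is an addition that the proposal does not mention; it is there so that a reader does not miss a change that happens between two reads.

```python
import threading


class Monitor:
    """Hypothetical stand-in for the proposed 'monitor' file (not statfs code)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._state = {}   # node name -> "up" or "down" (assumed state values)
        self._version = 0  # assumed counter, bumped on every real state change

    def update(self, node, state):
        """Record one sample, as statfs might after reading a node's 'state' file."""
        with self._cond:
            if self._state.get(node) != state:
                self._state[node] = state
                self._version += 1
                self._cond.notify_all()  # wake every blocked reader

    def read(self, seen_version=0, timeout=None):
        """Block until something changed since seen_version (or the timeout
        expires), then return (version, sorted list of nodes currently down)."""
        with self._cond:
            self._cond.wait_for(lambda: self._version != seen_version, timeout)
            down = sorted(n for n, s in self._state.items() if s == "down")
            return self._version, down
```

A scheduler would keep the version returned by its last read and pass it back in on the next one, so that each call only returns once there is something new to look at. Whether the real file should return only the down nodes or the full node list is still an open question.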
