By any chance, is the time it stops around 4-4:30 AM, when the backups are being 
done? I am starting to see zenperfsnmp croak around this time. The I/O wait for 
the system goes through the roof and zenperfsnmp just doesn't recover from the 
problem devices. It then just stops in the middle of a cycle with no errors.
A strace on the pid reports this...
recvmsg(66,
66 in this case seems to be the file descriptor of a UDP socket for a specific machine.
When trying to restart normally it reports this...
recvmsg(66, 0x7fff1b2cefa0, 0)          = ? ERESTARTSYS (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
rt_sigaction(SIGTERM, {0x3eacebbab0, [], SA_RESTORER, 0x317d20dd40}, 
{0x3eacebbab0, [], SA_RESTORER, 0x317d20dd40}, 8) = 0
rt_sigreturn(0xf)                       = -1 EINTR (Interrupted system call)
recvmsg(66, 0x7fff1b2cefa0, 0)          = ? ERESTARTSYS (To be restarted)
--- SIGTERM (Terminated) @ 0 (0) ---
rt_sigaction(SIGTERM, {0x3eacebbab0, [], SA_RESTORER, 0x317d20dd40}, 
{0x3eacebbab0, [], SA_RESTORER, 0x317d20dd40}, 8) = 0
rt_sigreturn(0xf)                       = -1 EINTR (Interrupted system call)
recvmsg(66,  <unfinished ...>

An lsof shows around 71 UDP ports open by the process.
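
To tie a descriptor such as 66 from the strace output back to one of those UDP sockets, something like the following might help. It is only a rough sketch, assuming a Linux host with /proc mounted and IPv4 sockets (IPv6 sockets live in /proc/net/udp6 and are not handled). The function names are made up for illustration and are not part of Zenoss. The remote address is only meaningful if the socket was connected to a peer.

```python
import os
import socket
import struct


def decode_endpoint(hex_endpoint):
    """Decode a '/proc/net/udp' endpoint such as '00000000:C350'."""
    hex_ip, hex_port = hex_endpoint.split(":")
    # The kernel prints the 32-bit address as a host-order integer, so
    # packing it in native order restores the network-order bytes.
    ip = socket.inet_ntoa(struct.pack("=I", int(hex_ip, 16)))
    return ip, int(hex_port, 16)


def parse_udp_line(line):
    """Parse one data line of /proc/net/udp into a small dict."""
    fields = line.split()
    return {
        "local": decode_endpoint(fields[1]),
        "remote": decode_endpoint(fields[2]),
        "inode": int(fields[9]),
    }


def socket_for_fd(pid, fd):
    """Return the UDP table entry behind /proc/<pid>/fd/<fd>, or None."""
    link = os.readlink("/proc/%d/fd/%d" % (pid, fd))  # e.g. 'socket:[12345]'
    if not link.startswith("socket:["):
        return None
    inode = int(link[len("socket:["):-1])
    with open("/proc/net/udp") as table:
        next(table)  # skip the header line
        for line in table:
            entry = parse_udp_line(line)
            if entry["inode"] == inode:
                return entry
    return None
```

Calling socket_for_fd(pid, 66) as root or as the zenoss user, with the pid of the hung zenperfsnmp process, should show which local port (and peer, if connected) the process is blocked on.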

It is hung in this state. When I killed it with -USR1 (I was trying different 
signals before resorting to -9), it died. After restarting via the GUI, some 
machines were not being collected, as reported in the "Recovery Issues" thread:
http://community.zenoss.com/forums/viewtopic.php?t=3920

When I started from the command line, it started up just fine.

zenperfsnmp needs to be able to recover from this.
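
To illustrate the kind of recovery I mean: the strace shows the process sitting in a recvmsg() that never returns. This is not zenperfsnmp's actual code (I have not looked at how it does its socket I/O), just a minimal sketch with plain Python sockets and hypothetical function names, showing how a receive timeout lets a collection loop give up on a silent device and move on to the next one.

```python
import socket


def recv_with_timeout(sock, timeout_s, bufsize=4096):
    """Return one datagram, or None if nothing arrives within timeout_s."""
    sock.settimeout(timeout_s)
    try:
        data, _addr = sock.recvfrom(bufsize)
        return data
    except socket.timeout:
        # The device never answered; report that instead of blocking forever.
        return None


def poll_devices(socks, timeout_s=2.0):
    """Try each device's socket in turn; a silent device yields None."""
    results = {}
    for name, sock in socks.items():
        results[name] = recv_with_timeout(sock, timeout_s)
    return results
```

With something along these lines, a device that stops answering during the backup window would cost at most one timeout per cycle rather than stalling the whole cycle.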




-------------------- m2f --------------------

Read this topic online here:
http://community.zenoss.com/forums/viewtopic.php?p=14576#14576

-------------------- m2f --------------------


