One of the processes will not die. (kill -9 <pid>, killall -9 <pid>, and
killall -9 zenperfsnmp all do nothing.)
I had to reboot the system yesterday to try to get the rogue processes to
die. It looks like I might have to do the same today. Any suggestions
on trying to kill the last process?
zenoss 2015 0.0 0.4 28140 2284 ? D Aug16 0:02
/usr/local/zenoss/bin/python
/usr/local/zenoss/Products/ZenRRD/zenperfsnmp.py --configfile
/usr/local/zenoss/etc/zenperfsnmp.conf --cycle --daemon
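
A note on the ps line above: the D in the STAT column means uninterruptible sleep, which usually indicates a process blocked inside a kernel call (commonly I/O). Signals, including SIGKILL from kill -9, are not acted on until that call returns, which would explain why none of the kill attempts have any effect. The sketch below picks such processes out of ps output. The function name list_d_state is made up for illustration, and the usage line assumes a procps-style ps.

```shell
# list_d_state (hypothetical helper): read "PID STAT COMMAND" lines on
# stdin and print the PIDs whose state starts with D (uninterruptible sleep).
list_d_state() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Assumed usage on a live system; the trailing "=" suppresses the headers:
#   ps -eo pid=,stat=,comm= | list_d_state
# To see which kernel function a stuck PID is waiting in:
#   ps -o pid,stat,wchan -p 2015
```

If the process stays in D for hours, the thing it is waiting on (for example a hung mount or a device) is probably what needs attention; the process should exit on its own once that call completes.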
Todd M. Hebert
Eric Newton wrote:
Oh. Well, you have too many zenperfsnmp processes running. Kill them
all. Then run zenperfsnmp with strace:
$ strace -f -e trace=network -s 2000 zenperfsnmp run -v 10 \
    2>send-to-eric
Stop it when you see the errors in the log file. Then restart
zenperfsnmp as a daemon:
$ zenperfsnmp start
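
For the "kill them all" step, a minimal sketch of collecting every zenperfsnmp PID before sending a signal. The helper name zenperfsnmp_pids is invented for this example, and it assumes the daemons show zenperfsnmp.py in their command line, as they do in the ps output in this thread.

```shell
# zenperfsnmp_pids (hypothetical helper): read "PID ARGS..." lines on stdin
# and print the PID of every line whose arguments mention zenperfsnmp.py.
# The dot is escaped in the pattern, so awk's own command line (which
# contains the backslash) does not match when fed live ps output.
zenperfsnmp_pids() {
    awk '/zenperfsnmp\.py/ { print $1 }'
}

# Assumed usage (xargs -r is a GNU extension that skips kill when the
# list is empty):
#   ps -eo pid=,args= | zenperfsnmp_pids | xargs -r kill
```

A process in state D will not respond to this until it leaves uninterruptible sleep.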
-Eric
Todd Michael Hebert wrote:
This is all that ended up in the file:
Process 2026 attached - interrupt to quit
Process 2026 detached
Zenoss status says the processes are running.
This is what I get from the ps auxww command:
zenoss 2027 0.0 0.4 28140 2312 ? S Aug16 0:00
/usr/local/zenoss/bin/python
/usr/local/zenoss/Products/ZenRRD/zenperfsnmp.py --configfile
/usr/local/zenoss/etc/zenperfsnmp.conf --cycle --daemon
zenoss 2028 0.0 0.4 28140 2312 ? S Aug16 0:00
/usr/local/zenoss/bin/python
/usr/local/zenoss/Products/ZenRRD/zenperfsnmp.py --configfile
/usr/local/zenoss/etc/zenperfsnmp.conf --cycle --daemon
zenoss 2015 0.0 0.4 28140 2312 ? D Aug16 0:02
/usr/local/zenoss/bin/python
/usr/local/zenoss/Products/ZenRRD/zenperfsnmp.py --configfile
/usr/local/zenoss/etc/zenperfsnmp.conf --cycle --daemon
zenoss 2026 0.0 0.4 28140 2312 ? S Aug16 0:00
/usr/local/zenoss/bin/python
/usr/local/zenoss/Products/ZenRRD/zenperfsnmp.py --configfile
/usr/local/zenoss/etc/zenperfsnmp.conf --cycle --daemon
root 8680 0.0 0.1 1828 584 pts/0 S+ 15:13 0:00 grep zenperfsnmp
Todd M. Hebert
Eric Newton wrote:
Oops... sorry for my last content-free reply.
For some reason, zenperfsnmp isn't getting through all the devices.
This will cause heartbeat failures, and the log messages below.
Can you do this for me:
$ ps auxww | grep zenperfsnmp
Note the process ID, then attach strace to it:
$ strace -e trace=network -s 2000 -p [process id from above] \
    2>file-to-send-to-eric
Interrupt it after a minute.
Zip the file and send it to me. This will let me see what packets
are going out and coming back.
This will include things like IP addresses and community strings, so
send it off list. I'm specifically looking for sendto/recvfrom on
the IP addresses that are not collecting. We should see several
attempts to talk to those devices in one minute. I will check the
integrity of the packets and verify that we are decoding them properly.
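
A rough way to check the capture before sending it is to count sendto/recvfrom calls per peer address. This is only a sketch: the function name count_snmp_calls is made up, and it assumes strace prints the peer as inet_addr("a.b.c.d"), which is its usual rendering of an IPv4 socket address.

```shell
# count_snmp_calls (hypothetical helper): read an strace log on stdin and
# print one "CALL IP COUNT" line per sendto/recvfrom peer address.
count_snmp_calls() {
    awk '
        match($0, /(sendto|recvfrom)\(/) {
            call = substr($0, RSTART, RLENGTH - 1)      # drop the "("
            if (match($0, /inet_addr\("[0-9.]+"\)/)) {
                # skip the 11 characters of inet_addr(" and the closing ")
                ip = substr($0, RSTART + 11, RLENGTH - 13)
                n[call " " ip]++
            }
        }
        END { for (k in n) print k, n[k] }
    ' | LC_ALL=C sort
}

# Assumed usage:
#   count_snmp_calls < file-to-send-to-eric
```

A device that shows sendto entries but no matching recvfrom entries is being polled without answering. A device that is missing entirely is never being reached in the cycle.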
-Eric
Todd Michael Hebert wrote:
These are in a rotated-out zenperfsnmp.log:
2006-08-11 16:23:02 WARNING zen.zenperfsnmp: Deleting old RRD file:
/usr/local/zenoss/perf/B1/ifOutOctets.rrd
There appears to be one of these for every single RRD file.
I'm also getting these errors, and lots of them:
2006-08-14 18:59:29 WARNING zen.zenperfsnmp: Devices status is not
clearing. Restarting.
2006-08-14 19:08:17 WARNING zen.zenperfsnmp: Devices status is not
clearing. Restarting.
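
To put a number on "lots of them", the restarts can be tallied per day. This is a sketch under one assumption: each entry is a single line in the real log file (the wrapping above comes from the mail client) and starts with the date. The function name count_restarts is invented for the example.

```shell
# count_restarts (hypothetical helper): read zenperfsnmp log lines on stdin
# and print "DATE COUNT" for the "Devices status is not clearing" warnings.
count_restarts() {
    grep 'Devices status is not clearing' |
        awk '{ n[$1]++ } END { for (d in n) print d, n[d] }' |
        LC_ALL=C sort
}

# Assumed usage:
#   count_restarts < zenperfsnmp.log
```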
Right now I also have 2-3 devices with events that just won't clear,
but everything else seems to be monitoring normally.
I'm getting very frustrated, and I really don't want to take the
whole Zenoss install down & start over. I'm monitoring 152 devices
at this point... and the system has really been great for diagnosing
network problems and knowing when anything goes down.
Todd M. Hebert
_______________________________________________
zenoss-users mailing list
[email protected]
http://lists.zenoss.org/mailman/listinfo/zenoss-users