I recently upgraded from 2.2.4 to 2.3.2, and yesterday to 2.3.3 (CentOS
4.7). After the 2.3.2 upgrade I initially saw high load averages,
primarily from the zenhub process consuming an entire processor. (It
seemed to settle down after I manually killed the zenmodeler process;
the stop command did not kill it.) This caused heartbeat time-outs all
over the place. Since the upgrade to 2.3.3 the processor load has
settled down, but I am now seeing multiple daemons fail with a
segmentation fault. It looks like the zenwin process may be at the heart
of the problem, causing cascading failures in zenmodeler and eventually
in zenhub.
I have to run the processes in the foreground to see any of these
problems. (Using "zenmodeler start" to daemonize creates a process
that is unresponsive and logs nothing.)
[zen...@server log]$ zenwin run -v10
DEBUG:zen.zenwin:run
DEBUG:zen.zenwin:Connecting to localhost:8789
DEBUG:zen.zenwin:Logging in as admin
INFO:zen.zenwin:Connected to ZenHub
DEBUG:zen.zenwin:setting up services EventService,
Products.ZenWin.services.WmiConfig
DEBUG:zen.zenwin:chaining getInitialServices with d2
DEBUG:zen.zenwin:callback after getting service EventService
DEBUG:zen.zenwin:callback after getting service
Products.ZenWin.services.WmiConfig
DEBUG:zen.zenwin:Queueing event {'severity': 0, 'component': 'zenwin',
'agent': 'zenwin', 'summary': 'started', 'manager': 'localhost',
'device': 'localhost', 'eventClass': '/App/Start'}
DEBUG:zen.zenwin:Calling connected.
INFO:zen.zenwin:Setting configCycleInterval to 30
DEBUG:zen.zenwin:Loading classes ['Products.ZenModel.MinMaxThreshold']
DEBUG:zen.thresholds:Updating threshold ('zeneventlog cycle time',
('localhost', ''))
DEBUG:zen.thresholds:Updating threshold ('zenmodeler cycle time',
('localhost', ''))
DEBUG:zen.thresholds:Updating threshold ('zenperfsnmp cycle time',
('localhost', ''))
DEBUG:zen.thresholds:Updating threshold ('zenping cycle time',
('localhost', ''))
DEBUG:zen.thresholds:Updating threshold ('zenprocess cycle time',
('localhost', ''))
DEBUG:zen.thresholds:Updating threshold ('zenwin cycle time',
('localhost', ''))
DEBUG:zen.thresholds:Updating threshold ('zenwinmodeler cycle time',
('localhost', ''))
INFO:zen.zenwin:Starting zenwin
DEBUG:zen.zenwin:Queueing event {'severity': 0, 'component': 'zenwin',
'agent': 'zenwin', 'summary': 'Starting zenwin', 'manager': 'localhost',
'device': 'localhost', 'eventClass': '/App/Start'}
INFO:zen.zenwin:Scanning winserver.domain.com
DEBUG:zen.WMIClient:connect to XX.XX.XX.XX, user 'DOMAIN\\zenoss-account'
Segmentation fault
If I move the faulty device to a Maintenance state, I get the same error
on the next device.
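
For anyone trying to reproduce this, here is a minimal sketch of a
wrapper that runs a daemon in the foreground and reports whether it
exited normally or was killed by a signal such as SIGSEGV. It is not
part of Zenoss. It assumes a modern Python 3 interpreter on a POSIX
system (not the Python bundled with Zenoss), and the helper names
describe_exit and run_foreground are made up for illustration.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as the negated
        # signal number (for example -11 for SIGSEGV on Linux).
        signum = -returncode
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "signal %d" % signum
        return "killed by " + name
    return "exited with status %d" % returncode


def run_foreground(argv):
    """Run argv in the foreground; return (returncode, description)."""
    proc = subprocess.run(argv)
    return proc.returncode, describe_exit(proc.returncode)


# Only runs when a command is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    code, text = run_foreground(sys.argv[1:])
    print("%s: %s" % (" ".join(sys.argv[1:]), text))
```

Saved under a hypothetical name such as run_fg.py, it could be invoked
as "python3 run_fg.py zenwin run -v10" so that a crash is reported
explicitly instead of the daemon going silent.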
All of this was in preparation for migrating to a new 64-bit server
install. Without a working 32-bit version, I don't want to move a faulty
install from one version to the next.
--
James D. Roman
Sr. Network Administrator
Science Systems and Application, Inc.
Phone: 301-867-2101