What might help is knowing the data types you are using. First, do they
match the data type returned by snmpwalk? 

I suspect that you are measuring a counter. First, understand that each
record in a COUNTER or DERIVE RRD database is a rate over time,
normally a value per second. RRD holds the very first value but stores
nothing until a second consecutive value has been recorded. It then
subtracts value 1 from value 2 and divides by the number of seconds
that have elapsed.
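
As a rough illustration of that subtraction and division, here is a
minimal Python sketch. The function name and the sample numbers are made
up for the example, and it ignores how RRD normalizes samples onto step
boundaries.

```python
def counter_rate(prev_value, prev_time, value, time):
    """Per-second rate between two consecutive raw counter samples."""
    elapsed = time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (value - prev_value) / elapsed


# Hypothetical samples as (timestamp in seconds, raw counter value).
samples = [(0, 1000), (300, 4000), (600, 10000)]

# The first sample yields no rate on its own; each later sample is
# paired with the one before it.
rates = [
    counter_rate(v0, t0, v1, t1)
    for (t0, v0), (t1, v1) in zip(samples, samples[1:])
]
```

Three samples give only two stored rates, which is why nothing shows up
until the second poll has completed.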

There are several limits that you can place on an RRD database: a max
value, a min value, and a heartbeat (the maximum number of seconds
allowed since the last recorded value). You can view the limits
configured for an rrd file with "rrdtool info rrdfilename". If a value
falls outside any of these limits, you will get a "nan" value. A 32 bit
counter has an implicit max at 2^32, at which point the counter wraps.
A counter is expected to only roll forward, so if a value is lower than
the previous one, the assumption is that the counter wrapped during the
period. If your counter wraps and then passes the last recorded value
again within the time between Zenoss samples (default is 5 minutes),
then the real change is larger than the 2^32 maximum and cannot be
recorded correctly.
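
The interaction between the limits and the wrap can be sketched in
Python. This is a simplified model rather than rrdtool's actual code:
the function name and parameters are hypothetical, and it assumes at
most one wrap of a 32 bit counter between samples.

```python
import math

COUNTER_32_MODULUS = 2 ** 32


def checked_rate(prev_value, value, elapsed, heartbeat,
                 min_rate=None, max_rate=None):
    """Return the per-second rate, or NaN when a limit is violated."""
    # Too long since the last sample: the heartbeat limit is exceeded.
    if elapsed <= 0 or elapsed > heartbeat:
        return math.nan
    delta = value - prev_value
    if delta < 0:
        # A lower value is assumed to mean exactly one 32 bit wrap.
        delta += COUNTER_32_MODULUS
    rate = delta / elapsed
    if min_rate is not None and rate < min_rate:
        return math.nan
    if max_rate is not None and rate > max_rate:
        return math.nan
    return rate


normal = checked_rate(1000, 4000, 300, heartbeat=600)
wrapped = checked_rate(2 ** 32 - 1500, 1500, 300, heartbeat=600)
late = checked_rate(1000, 4000, 900, heartbeat=600)
# A counter that was reset (5000 -> 100) looks like a huge wrap and
# is rejected by the max limit.
reset = checked_rate(5000, 100, 300, heartbeat=600, max_rate=1e6)
```

In this model the late sample and the reset both come back as NaN, which
is one way a graph can end up with blank periods.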

Sometimes issues occur because of the format of the data returned by an
snmpget. For example, if the SNMP value drops the decimal point and
returns 7700 instead of 77.00, then on a rapidly moving counter you
will reach the maximum value sooner than is reasonable. In these
situations, using an RPN expression to transform the value is
recommended (in this case "100,/").
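
To show what "100,/" does to the value, here is a small Python sketch of
an RPN evaluator. It assumes the polled value is pushed onto the stack
first and the expression tokens follow. It handles only the four basic
arithmetic operators and is not the evaluator Zenoss or rrdtool uses.

```python
import operator

# Only basic arithmetic is supported in this sketch.
RPN_OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def apply_rpn(value, expression):
    """Apply a comma-separated RPN expression to a polled value."""
    stack = [float(value)]  # the raw value goes on the stack first
    for token in expression.split(","):
        if token in RPN_OPERATORS:
            right = stack.pop()
            left = stack.pop()
            stack.append(RPN_OPERATORS[token](left, right))
        else:
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("expression did not reduce to a single value")
    return stack[0]


scaled = apply_rpn(7700, "100,/")
```

Here 7700 is pushed, then 100, and "/" divides the first by the second,
restoring 77.0.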

Note that the cycle time is saved inside the rrd file (the step value in
rrdtool info), so once the file is created, it will only record data at
that rate. Changing the SNMP Performance Cycle Interval after the rrd
file is created will therefore not have the desired effect. You need to
delete the original rrd file, or create a new data point in Zenoss.
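
One way to check the step of an existing file is to read the output of
"rrdtool info rrdfilename". The Python sketch below parses text in the
"key = value" layout that the command prints. The sample text is
illustrative only: the data source name ds0 and all of the values are
made up, not taken from a real file.

```python
def parse_rrd_info(text):
    """Turn 'key = value' lines into a dict of strings."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" = ")
        if sep:  # skip lines that are not key/value pairs
            info[key.strip()] = value.strip().strip('"')
    return info


# Illustrative text only; a real file will have more lines.
SAMPLE_INFO = """\
step = 300
ds[ds0].type = "COUNTER"
ds[ds0].minimal_heartbeat = 600
ds[ds0].min = 0.0000000000e+00
ds[ds0].max = NaN
"""

info = parse_rrd_info(SAMPLE_INFO)
step_seconds = int(info["step"])
```

If the step still reads 300 after you change the cycle interval, the
file was created with the old rate and has to be recreated.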

At face value, a 5 second cycle time will not scale well. A number of
OS, network, and hardware latency limitations (CPU usage, hard disk
sync times, network saturation) make it impractical for a large scale
implementation. Remember, the cycle time is how often a query is
initiated, not how often the responses will be received. If you really
need that kind of real time response, a lighter weight monitoring
architecture is recommended.


On Tue, 2007-09-25 at 08:41 +0000, richardbowden wrote:
> I am new to this whole network monitoring lark but have just installed zenoss 
> 2.0.6 because I want to get a near real time (say 5 seconds latency) 
> graphical representation of some statistics coming from a custom MIB over a 
> network.  
> 
> I have my MIB working fine, I can poll it using snmpwalk and it returns good 
> values.  I have imported my MIB into zenoss which sees all the OIDs and names 
> just fine, I have created data types and data points in the template for one 
> of my end nodes running the MIB daemon, I have created 5 graphs to show 5 
> different data types.  The first two ran ok for about half and hour, then 
> stopped.  The next two have never shown any data, the last one runs fine.  I 
> queried the RRD database which shows 'nan's for all the values which don't 
> show on the graph so I can't fault the graph itself, it appears RRD is not 
> getting the right values.  But if I query snmpwalk using an OID copy/pasted 
> from zenoss it works fine.  Why would this be going wrong for 4 out of 5 
> graphs?
> 
> On a maybe related note, for the first couple of days of running this, the 
> graphs (I'm on the default CPU utilization etc graphs now) would sometimes 
> cut out.  The only way I found to start them again was run 
> #/etc/init.d/zenoss restart .  Again the RRD Database would be full of 'nan's 
> in those blank periods.  Is this likely related?
> 
> On a different note, I want to change the update rate to about 5 seconds.  I 
> changed the 'SNMP Performance Cycle Interval' value for the localhost to 5 
> seconds, pressed save and restarted zenoss for good measure and nothing 
> happened.  I changed all the other values on that page to 5 seconds too and, 
> not suprisingly, nothing happened.  I had a quick look at the code and, 
> although I don't know any Python, it appears to me that it sets a variable 
> perfsnmpCycleInterval to '5 * 60' by default so I assume it is not managing 
> to set the variable to my value.  Any ideas why not?
> 
> Thanks for any help.  I have had much more success than I had with OpenNMS 
> but I am just not quite where I need to be yet.
> 
> Richard
> 
> ------------------------
>  Richard Bowden
> 
> 
> 
> 
> -------------------- m2f --------------------
> 
> Read this topic online here:
> http://community.zenoss.com/forums/viewtopic.php?p=11054#11054
> 
> -------------------- m2f --------------------
> 
> 
> 
> _______________________________________________
> zenoss-users mailing list
> [email protected]
> http://lists.zenoss.org/mailman/listinfo/zenoss-users
-- 
James D. Roman
IT Network Administration

Terranet Inc.
On contract to:
Science Systems and Applications, Inc.

