Daniel,

you can also try using the 64-bit counters from your SNMP device. These are available via the ifXTable portion of the IF-MIB. See this link for more information:

http://www.cisco.com/en/US/tech/tk648/tk362/technologies_q_and_a_item09186a00800b69ac.shtml
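For what it's worth, the 64-bit input counter lives at IF-MIB::ifHCInOctets (.1.3.6.1.2.1.31.1.1.1.6), with ifHCOutOctets next to it in the same table. A walk against your switch and interface index should look something like this (the value shown here is made up for illustration):

    snmpwalk -v2c -c public ASHF-6509-01 1.3.6.1.2.1.31.1.1.1.6.116
    IF-MIB::ifHCInOctets.116 = Counter64: 912345678901

Note that Counter64 objects are only visible over SNMPv2c or v3 (your walk below already uses -v2c, so you're set), and at gigabit rates a 64-bit counter takes centuries to wrap, so the rollover problem you describe simply goes away.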
HTH,
-=Tom Nail

Daniel McKinney wrote:

> Folks,
>
> I learned something very valuable in Zenoss over the past few weeks about how interface speeds are graphed, and I decided to share. :)
>
> OK, first of all, what I learned.
>
> It looks like my GigabitEthernet links report their traffic as a cumulative count of octets rather than as a rate (which I imagine all links report that way):
>
> [EMAIL PROTECTED] root]# snmpwalk -v2c -c public ASHF-6509-01 1.3.6.1.2.1.2.2.1.10.116
> IF-MIB::ifInOctets.116 = Counter32: 4027999182
>
> I am still a little shady on the math involved, but I think I understand the theory now.
>
> A port's traffic shows up as an ever-increasing counter, not as a rate. If an snmpget on the OID for your IF-MIB::ifInOctets.xxx catches the counter at one value, then again at a later value, Zenoss can calculate the rate from the difference just fine. But the counter is only 32 bits wide and rolls over, and if the interval between the first capture and the second capture (5 minutes by default in Zenoss) spans two counter rollovers, Zenoss gets REALLY confused. I tried to draw what I mean below; I hope it helps explain the theory.
>
> [attached image: Speed_rate]
>
> So anyway, long story short: when I tried to plot my GigabitEthernet links, I would get correct data in the off times when the rate was pretty slow, but during the day the data was ALL over the place. Also, the actual data was off by roughly a factor of 8. So, being a noob to this stuff, I went into the ifInOctets and ifOutOctets data points inside the data sources and changed the RPN from "8,*" to "64,*" to get the data to print even remotely near the rate the link was actually showing via the CLI. See below for a sample of that mess:
>
> [attached image: wrong_graph]
>
> So you can see that during the off hours it is a nice gradual decline/incline. But during the day... all over the place.
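A note on the two problems you hit, since they are actually separate. The factor of 8 is just units: ifInOctets counts octets (bytes), and the "8,*" RPN multiplies the octets-per-second rate by 8 to graph bits per second, so it should stay "8,*" no matter what the polling interval is. The instability is the 32-bit rollover. Rough numbers, as a quick Python sketch (nothing Zenoss-specific):

    def wrap_time_seconds(link_bps):
        # A Counter32 rolls over after 2**32 = 4,294,967,296 octets.
        return (2 ** 32) / (link_bps / 8.0)

    for bps in (10e6, 100e6, 1e9):
        print("%5.0f Mbit/s: wraps every %6.1f s" % (bps / 1e6, wrap_time_seconds(bps)))

    # 10 Mbit/s:   wraps every 3436.0 s (~57 min; a 5-minute poll is safe)
    # 100 Mbit/s:  wraps every  343.6 s (~5.7 min; a 5-minute poll barely fits)
    # 1000 Mbit/s: wraps every   34.4 s (a busy 5-minute poll can span ~8 wraps)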
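And this is (roughly) why one rollover per interval is survivable but two or more are not. RRD-style counter handling treats a negative delta as exactly one wrap and adds the counter's range back; a second wrap in the same interval is invisible and silently produces a wrong rate, which is why your off-hours data stayed sane while the busy hours went haywire. A sketch with made-up sample values:

    def octets_per_second(prev, curr, interval, counter_bits=32):
        """Derive a rate from two counter samples, the way an RRD COUNTER data source does."""
        delta = curr - prev
        if delta < 0:
            # Counter went "backwards": assume exactly one rollover.
            # Two rollovers in one interval look identical to one, so
            # the computed rate would be wrong with no error raised.
            delta += 2 ** counter_bits
        return delta / float(interval)

    # Samples 60 s apart with one rollover in between:
    rate = octets_per_second(prev=4290000000, curr=55000000, interval=60)
    print(rate * 8)  # the "8,*" RPN step: octets/s to bits/s, ~8 Mbit/s here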
>
> The fix was to change the polling interval from 5 minutes down to 1 minute.
>
> Now, my concern was: if I change the global interval from 5 minutes to 1 minute, won't that tax the box Zenoss is running on a lot more, seeing as I have about 120 hosts and roughly 60 data points being graphed on each box? I found this post on the message board where Eric Newton addressed exactly that:
>
>> 2. I'm interested in process monitoring, but it seems that process status is only updated during the SNMP cycle interval. By default that is 5 minutes, which for our customers would be much too long. How short an interval can I set before the SNMP polling becomes a performance issue?
>>
>> The only real limit is once-per-second, because RRD files cannot store data points more frequently than that.
>>
>> Monitors -> Performance Monitors -> localhost -> SNMP Cycle Time can be changed to whatever you need. Most modern servers can scan 2-10 machines per second, if you are looking for a "best lowest number". So, if you want to scan 100 machines, you will need 10-50 seconds. Increased memory improves disk cache performance, which is the primary bottleneck.
>>
>> You can create alternative performance configurations (the next version of the GUI will let you do that via the web interface). This allows you to spread your collectors over multiple machines, or to have different collection cycles for different types of collectors and devices.
>>
>> Once you increase your collection rate, remove your existing RRD files so that they are re-created to use the smaller step size.
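That last point matters because an RRD file's step is fixed at creation time; a file created for 300-second polling keeps consolidating your 1-minute samples into 5-minute buckets, so the old files have to go. For the curious, the re-creation amounts to something like this under the hood (using the rrdtool Python bindings; the file name, data source name, heartbeat, and archive length here are illustrative, since Zenoss picks its own):

    import rrdtool  # Python bindings for rrdtool; assumes they are installed

    rrdtool.create(
        "ifInOctets.rrd",
        "--step", "60",               # must match the new 1-minute cycle time
        "DS:ds0:COUNTER:180:0:U",     # COUNTER does the wrap-aware delta math;
                                      # a 180 s heartbeat tolerates missed polls
        "RRA:AVERAGE:0.5:1:600",      # keep 600 raw 1-minute averages
    )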
>
> Interestingly enough, once I changed the interval to 1 minute instead of 5 minutes, the rate the link was reporting SHOT up to a factor of 8 PAST what it should be, as you can see in the graph below:
>
> [attached image: wrong_graph_after]
>
> So, I went back into the RPN for the ifInOctets and ifOutOctets data points and changed "64,*" back to "8,*", and the resulting graph (after I deleted the .rrd files) is a close-to-correct representation of the speeds shown on the CLI, and also a LOT more stable. It still looks a little choppy, which I assume is the counter still occasionally wrapping more than once within a single poll, but it is nowhere near as bad. I may have to drop it down to 30 seconds, but I am pretty skittish about that.
>
> [attached image: correct_graph]
>
> So as you can see, I'm a happy camper. Now the next question I am going to ask the Zenoss developers in a separate email is: is there any way to set the interval for a particular device, or even better, for a particular data source? That way, I don't have to poll my memory/CPU % for each server at 1-minute intervals when 5 minutes is fine for those.
>
> Hope this helps some other folks having similar problems,
>
> -Daniel

_______________________________________________
zenoss-users mailing list
[email protected]
http://lists.zenoss.org/mailman/listinfo/zenoss-users