On 08/10/14 17:12, Curtis K. Larsen wrote:
I am trying to be proactive regarding scaling of RADIUS servers as we move to a 
new load-balanced environment.  The idea is to know when we are getting close 
to a threshold and need to add another VM, allocate more CPU, RAM, etc.

I've used a traffic generator to simulate authentications against our
FreeRADIUS VMs, so I know the maximum number of auths per second that a server
can handle, and I've seen that the server starts to reject clients when it
can't keep up with the load.  So I am thinking that a dashboard that graphs
auths per second, plus a pie chart of successful vs. failed requests with some
alerting, would allow us to preempt load/growth issues.  It seems this info
wouldn't be too difficult to grab from syslog and graph on a web page somehow.
I am just wondering whether any of you have already done this, or something
like it, and could share it before I reinvent the wheel.  Let me know.

Thanks,

Curtis Larsen
University of Utah


Hi Curtis,

We use Nagios[1] to monitor all our systems, including our FreeRADIUS servers. An add-on called pnp4nagios[2] graphs the performance data[3] with RRD.

FreeRADIUS itself includes a virtual status server[4]. If you enable it, you can send a Status-Server packet to FreeRADIUS using radclient, and it will respond with a packet containing some useful status variables, including counts of successful and failed auths.
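As a rough sketch of what reading that reply might look like, the Python below parses radclient-style text output into a dictionary of counters. The function name parse_status_reply and the sample reply text are made up for illustration, and the attribute names shown are an assumption about what the status server returns, so check the output from your own server before relying on them.

```python
import re

# Matches lines such as "    FreeRADIUS-Total-Access-Requests = 21287"
_COUNTER_RE = re.compile(r"^\s*(FreeRADIUS-[A-Za-z0-9-]+)\s*=\s*(\d+)\s*$")


def parse_status_reply(text):
    """Return {attribute_name: int} for every numeric FreeRADIUS-* line."""
    counters = {}
    for line in text.splitlines():
        match = _COUNTER_RE.match(line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


# Hypothetical sample of a status reply, not captured from a real server.
SAMPLE_REPLY = """\
Received Access-Accept Id 230 from 127.0.0.1:18121 to 127.0.0.1:41234 length 140
    FreeRADIUS-Total-Access-Requests = 1200
    FreeRADIUS-Total-Access-Accepts = 1100
    FreeRADIUS-Total-Access-Rejects = 100
"""

if __name__ == "__main__":
    print(parse_status_reply(SAMPLE_REPLY))
```

Keeping the parser separate from whatever runs radclient means it can be tested against saved output without a live server.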

I've written a Nagios plugin to interrogate the RADIUS server and format the data for Nagios[5]. The gist also includes some example config snippets for using the plugin with Nagios. You can of course slurp the status data into any format you like. The status packet only contains cumulative packet counters since the FreeRADIUS daemon was started, so the plugin has to save the results from the last check, take the difference, and divide by the time elapsed between checks to give a "per second" rate.
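The counter-to-rate step could be sketched roughly as follows. This is not the plugin from the gist, just a minimal Python illustration under some assumptions: the state is kept in a JSON file, the names rates_since_last_check and perfdata are hypothetical, and a counter that goes backwards is treated as a daemon restart and skipped for that check.

```python
import json
import os
import tempfile
import time


def rates_since_last_check(counters, state_path, now=None):
    """Return per-second rates for each counter, or {} on the first run."""
    now = time.time() if now is None else now
    previous = None
    if os.path.exists(state_path):
        with open(state_path) as fh:
            previous = json.load(fh)

    # Always store the current sample for the next check.
    with open(state_path, "w") as fh:
        json.dump({"time": now, "counters": counters}, fh)

    if previous is None:
        return {}
    elapsed = now - previous["time"]
    if elapsed <= 0:
        return {}

    rates = {}
    for name, value in counters.items():
        old = previous["counters"].get(name)
        # A counter that went backwards suggests a daemon restart; skip it.
        if old is None or value < old:
            continue
        rates[name] = (value - old) / elapsed
    return rates


def perfdata(rates):
    """Format rates as Nagios performance data ('label'=value pairs)."""
    return " ".join(f"'{name}'={rate:.2f}" for name, rate in sorted(rates.items()))


if __name__ == "__main__":
    # Demo with made-up counter values and timestamps five minutes apart.
    state = os.path.join(tempfile.mkdtemp(), "radius_state.json")
    rates_since_last_check({"FreeRADIUS-Total-Access-Requests": 1200}, state, now=0.0)
    rates = rates_since_last_check(
        {"FreeRADIUS-Total-Access-Requests": 1500}, state, now=300.0
    )
    print("RADIUS OK | " + perfdata(rates))
```

The text after the pipe character is what Nagios treats as performance data, which is what a grapher such as pnp4nagios picks up.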

Our system gives us nice graphs like this one[6] which is showing the Access-Requests per second on the three primary RADIUS servers. The spiky behaviour on the top two graphs (which serve campus) is due to people moving around en masse on the hour, as lectures change over. The third graph (which serves residences) is pretty steady, but also shows that students never really go to sleep!

[1] http://www.nagios.org/
[2] https://docs.pnp4nagios.org/
[3] http://nagios.sourceforge.net/docs/3_0/perfdata.html
[4] http://wiki.freeradius.org/config/Status
[5] https://gist.github.com/djjudas21/cd1e7bfee44fb879855d
[6] http://imgur.com/F4xeNUm

Hope this helps,
Jonathan

**********
Participation and subscription information for this EDUCAUSE Constituent Group 
discussion list can be found at http://www.educause.edu/groups/.
