Using NRG to gather stats

NRG is a front end to rrdtool, which is itself a development of MRTG. Essentially it is a script that uses snmpwalk and snmpget to pull data via SNMP and store it in an rrdtool database; the script then generates HTML and GIF files to display graphs of the data. It is configured using a metaconfiguration file and, optionally, a number of configuration files for specific hosts.
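
To make that data flow concrete, here is a minimal Python sketch of a single polling step: read one value with snmpget and append it to an rrdtool database. This is not NRG's own code. The function names, SNMP version and file name are illustrative assumptions, and it assumes the net-snmp and rrdtool command-line tools are installed and that the .rrd file already exists.

```python
import subprocess


def split_system(system):
    """Split a 'community@host' string into (community, host)."""
    community, _, host = system.rpartition("@")
    return community, host


def snmpget_cmd(system, oid):
    """Build the snmpget command line for one OID (SNMP v1 is an assumption)."""
    community, host = split_system(system)
    # -Oqv asks net-snmp to print only the value, without the OID or type.
    return ["snmpget", "-v", "1", "-c", community, "-Oqv", host, oid]


def rrd_update_cmd(rrd_file, value):
    """Build the rrdtool command that stores a value at the current time."""
    # "N" is rrdtool's shorthand for "now".
    return ["rrdtool", "update", rrd_file, "N:%s" % value]


def poll_once(system, oid, rrd_file):
    """Fetch one value over SNMP and append it to the rrd file."""
    result = subprocess.run(
        snmpget_cmd(system, oid), capture_output=True, text=True, check=True
    )
    value = result.stdout.strip()
    subprocess.run(rrd_update_cmd(rrd_file, value), check=True)
    return value
```

For instance, poll_once("public@1.2.3.4", "1.3.6.1.2.1.2.2.1.10.1", "example.rrd") would read the inbound octet counter of interface 1 and store it in a hypothetical example.rrd. NRG does the equivalent for every variable it has discovered and then draws the graphs.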

To install, add perl-Time-HiRes, rrdtool-1.0.33-2nrg and nrg-0.99.14-1dcs_nrg to your rpmcfg files. You will also need to add something like

#NRG stuff
<Directory /usr/local/apache/web/nrg>
 Options ExecCGI
</Directory>
AddHandler cgi-script .cgi
<Files "*.gif">
 ExpiresActive On
 ExpiresDefault M5
</Files>
to /usr/local/apache/conf/www.conf in order to enable CGI and make the GIFs auto-update. Secondly, change NRG_WEB_TITLE in /usr/local/nrg/Makefile to reflect your web page. Finally, you need to create the NRG.mconf and .conf files. The NRG.mconf file looks something like

Example 4. NRG.mconf file.

#
# $Id: NRG.mconf.in,v 1.38 2001/04/12 14:34:34 rader Exp $
#

define(APACHE_D,/usr/local/nrg/bin/nrg-discover-apache -debug)
define(BIND_D,/usr/local/nrg/bin/nrg-discover-bind -debug)
define(ERROR_D,/usr/local/nrg/bin/nrg-discover-bind -debug)
define(IFACES_D,/usr/local/nrg/bin/nrg-discover-ifaces -debug)
define(PINGD_D,/usr/local/nrg/bin/nrg-discover-pingd -debug)
define(SENDMAIL_D,/usr/local/nrg/bin/nrg-discover-sendmail -debug)
define(SNMPD_D,/usr/local/nrg/bin/nrg-discover-snmpd -debug)
define(TABLE_D,/usr/local/nrg/bin/nrg-discover-tables -debug)
define(TCP_D,/usr/local/nrg/bin/nrg-discover-tcp -debug)
define(SOMESWITCH_IFACES,"SomeSwitch's Network Interface Data Table")
define(SOMESWITCH_ERRORS,"SomeSwitch's Network Interface Errors Data Table")
define(SITE_PING,"Jupiter's Ping Data Table")
define(SOMESERVER_TCP,"Jupiter's TCP Service Response Time Table")

# do NOT use trailing slash... it tickles
# a nasty bug which hasn't been fixed yet...
WebRootDir[*]:       /usr/local/httpd/html
NRGSubDir[*]:        nrg
ConfFiles:           *.conf
BucketMconfTargets:  yes
HashBucketSize:      0
RunScript:           run-nrg

Directory:         /net/traffic
SomeSwitch:        IFACES_D public@1.2.3.4
SomeSwitch-iface:  TABLE_D -title SOMESWITCH_IFACES /net/traffic/SomeSwitch

Directory:         /net/errors
SomeSwitch:        ERROR_D public@1.2.3.4
SomeSwitch-err:    TABLE_D -title SOMESWITCH_ERRORS /net/errors/SomeSwitch

#Directory:        /apache
#SomeWebServer:    APACHE_D 1.2.3.5
Basically, leave all the defines alone. For each host you want to gather information on, define a directory to store the files in and the scripts you want to run to gather that information.

In the case of the lm_sensors details there are no autodiscovery scripts, so for the moment we have to write a .conf file for each host by hand. If you want the disk usage stats, you can get these via SNMPD_D. Each .conf file looks like:

#
# callisto-temp.conf - graph temp stats for Callisto
#

# .*-s\d+$ matches *-s0, *-s1, ...

Variable[Callisto-temp][cputemp]:    enterprises.ucdavis.255.3 GAUGE
Variable[Callisto-temp][systemp]:    enterprises.ucdavis.255.4 GAUGE
Variable[Callisto-temp][othertemp]:    enterprises.ucdavis.255.5 GAUGE
Variable[Callisto-temp][fanspeed]:    enterprises.ucdavis.255.6 GAUGE
YLabel[Callisto-temp]:               Centigrade
Units[Callisto-temp]:                %sDegC
CalcDef[Callisto-temp][sfanspeed]:   fanspeed,200,/
Graph[Callisto-temp][cputemp]:       red LINE2
Graph[Callisto-temp][systemp]:       blue LINE2
Graph[Callisto-temp][othertemp]:     green LINE2
Graph[Callisto-temp][sfanspeed]:     black LINE2
Label[Callisto-temp][cputemp]:       "CPU temp"
Label[Callisto-temp][systemp]:      "System temp"
Label[Callisto-temp][othertemp]:      "Some other temp"
Label[Callisto-temp][sfanspeed]:      "Scaled fanspeed (rpm/200)"
LowerLimit[Callisto-temp]:           30
#------------------------------------------------------------------

System[Callisto-temp]:            sommunity@callisto.dcs.ed.ac.uk
RRD[Callisto-temp]:               /jupiter/Callisto/callisto-temp.rrd
GraphWebPage[Callisto-temp]:      /jupiter/Callisto/callisto-temp.cgi

PageTitle[Callisto-temp]: Callisto temp data
PageTop[Callisto-temp]:   CPU and System temp data for Callisto
PageBody[Callisto-temp]:
  <TABLE>
    <TR> <TD>System:</TD> <TD>Callisto CPU and system temps.</TD></TR>
  </TABLE>
<BR>
PageBottom[Callisto-temp]: Callisto CPU and system Temps.
It should be fairly clear what's going on. Bear in mind that CalcDef uses reverse Polish notation; more detailed documentation is at the NRG website. Once you've got the configuration files sorted out, run make rediscover and NRG should go about building up the .conf files. Then run make notify to build the required directories and rrd files, and to build run-nrg, the script that does the information gathering.
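
If reverse Polish notation is unfamiliar, the following Python sketch shows how an expression like the CalcDef above, fanspeed,200,/, is read: operands are pushed onto a stack and each operator consumes the two most recent ones. It is only an illustration of the notation, assuming comma-separated tokens as in the example; the function name is made up, and it handles just the four arithmetic operators, whereas rrdtool's own RPN supports many more.

```python
def eval_rpn(expression, values):
    """Evaluate a comma-separated RPN expression such as 'fanspeed,200,/'."""
    operators = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in expression.split(","):
        token = token.strip()
        if token in operators:
            # The operator applies to the two most recently pushed operands.
            right = stack.pop()
            left = stack.pop()
            stack.append(operators[token](left, right))
        elif token in values:
            # A named variable: substitute its current reading.
            stack.append(float(values[token]))
        else:
            # Anything else is treated as a numeric constant.
            stack.append(float(token))
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: %r" % expression)
    return stack[0]


# A fan spinning at 5000 rpm is plotted as 5000 / 200.
print(eval_rpn("fanspeed,200,/", {"fanspeed": 5000}))  # prints 25.0
```

Dividing by 200 brings the fan speed into roughly the same range as the temperatures, which is why it can share a graph with them.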

Finally, set up cron to run run-nrg periodically (say every 5 minutes) and wait 20 minutes or so for the .rrd files to fill up with data.
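
Since the lm_sensors .conf files have to be written for each host, it can save typing to stamp them out from a template. The Python sketch below is one possible way to do that, not part of NRG. The function name, its parameters and the second host are hypothetical, and it assumes every host exposes the same ucdavis OIDs as the Callisto example. For brevity it emits only the Variable, YLabel, Graph, System and RRD lines for the three temperatures; the fan speed, labels and Page entries could be added the same way.

```python
# The three temperature readings from the callisto-temp.conf example:
# variable name, OID, graph colour.
READINGS = [
    ("cputemp", "enterprises.ucdavis.255.3", "red"),
    ("systemp", "enterprises.ucdavis.255.4", "blue"),
    ("othertemp", "enterprises.ucdavis.255.5", "green"),
]


def render_temp_conf(target, system, rrd_path):
    """Return the core lines of a temperature .conf file for one host.

    target   -- the name used inside the square brackets, e.g. "Callisto-temp"
    system   -- the community@host string for the System line
    rrd_path -- the path used for the RRD line
    """
    lines = []
    for name, oid, _colour in READINGS:
        lines.append("Variable[%s][%s]: %s GAUGE" % (target, name, oid))
    lines.append("YLabel[%s]: Centigrade" % target)
    for name, _oid, colour in READINGS:
        lines.append("Graph[%s][%s]: %s LINE2" % (target, name, colour))
    lines.append("System[%s]: %s" % (target, system))
    lines.append("RRD[%s]: %s" % (target, rrd_path))
    return "\n".join(lines) + "\n"


# Hypothetical second host; all three arguments are made up for illustration.
print(render_temp_conf("Europa-temp", "public@europa.example.org",
                       "/jupiter/Europa/europa-temp.rrd"))
```

Redirecting the output to a file such as europa-temp.conf and checking it by eye before running make notify keeps the hand-maintained files consistent across hosts.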