LRRD - Linpro RRD



Background

LRRD is a server/client pair that graphs and htmlifies the data it gathers, and optionally warns Nagios about it. It is designed to make it very easy to graph new datasources.

The Client

lrrd-client

Lrrd-client is a small Perl script that listens on port 4949 using Net::Server. It reads all the scripts in /etc/lrrd/lrrd-client.d on startup. The client accepts these commands:
list [node]
List available scripts for this node
nodes
List available nodes
config [script]
output configuration for [script]
fetch [script]
output script values
version
Output version string
quit
disconnect

Scripts

These scripts can be written in your language of choice: bash, Perl, Python, C, or anything else that your system can execute. The scripts can be run in several modes, the important ones being without parameters and with the "config" parameter. When run with "config" as its parameter, the script should output the configuration of the graph:
jo@yes:/etc/lrrd/client.d$ ./load config
graph_title Load average
graph_args --base 1000 -l 0
graph_vlabel load
graph_scale no
load.label load
load.warning 10
load.critical 120
The client supports quite a few options.
graph_title
The title of the graph, defaults to the servicename.
create_args
If set, the arguments will be passed on to rrdcreate.
graph_args
If set, the arguments will be passed on to rrdgraph.
graph_order
The order in which to draw the datasources. Can also include path aliases of the form alias=domain;host:graph.datasource. See further down for details.
graph_vlabel
Y-axis label of the graph.
graph_vtitle
Y-axis label of the graph. NOTE: Deprecated, use graph_vlabel.
graph_total
If set, summarise all the datasources' values and use the value of graph_total as a label.
graph_scale
Set to "yes" or "no". Defaults to "yes". Setting it to "no" disables scaling of the min/max/cur values.
graph
Set to "yes" or "no". Decides whether to draw the graph. Defaults to "yes".
update
Set to "yes" or "no". Decides whether lrrd-update should fetch data for the graph. Defaults to "yes".
host_name
Override which host name this plugin is run for. Ugly - see further down on how to do this in the client configuration files instead, which is more elegant.
{name}.label
REQUIRED. Name of the datasource. You can have many datasources in one graph.
{name}.cdef
RPN-expression. Modify the values before graphing. See the FAQ for examples.
{name}.draw
What to draw from the data source: AREA, LINE1-3. Defaults to LINE2.
{name}.graph
Set to "no" or "yes". Decides whether to graph the data source. Defaults to "yes".
{name}.max
Maximum value. If the fetched value is above "max", it will be discarded.
{name}.min
Minimum value. If the fetched value is below "min", it will be discarded.
{name}.negative
Name of field to 'mirror' on the opposite side of zero. See the FAQ for examples.
{name}.skipdraw
Disables drawing of datasource. NOTE: Deprecated - use {name}.graph instead.
{name}.type
Type of datasource, COUNTER, ABSOLUTE, DERIVE and GAUGE, defaults to GAUGE. Read "man rrdcreate" for more info.
{name}.warning
Used by lrrd-nagios. Can be a max value or a range separated by a colon. E.g. "min:", ":max", "min:max", "max".
{name}.critical
Same as above.
{name} is limited to 19 characters, and the characters [a-zA-Z0-9_].
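
The naming rule above can be checked mechanically. The following is a minimal Python sketch, not part of LRRD itself; the function name is_valid_field_name is made up for this example.

```python
import re

# Rule taken from the text: at most 19 characters, only [a-zA-Z0-9_].
FIELD_NAME_RE = re.compile(r"[a-zA-Z0-9_]{1,19}")


def is_valid_field_name(name):
    """Return True if name is usable as {name} in a plugin's config output."""
    return FIELD_NAME_RE.fullmatch(name) is not None
```

With this sketch, "load" and "cpu_user" pass, while "cpu-user" (contains "-") and any name of 20 characters or more are rejected.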

Without options the script should only give out {name}.value (value):

jo@yes:/etc/lrrd/client.d$ ./load
load.value 0.41
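
To illustrate the two modes, here is a hedged sketch of what a "load" script could look like in Python. It is not the script shipped with LRRD: the config lines are copied from the example output above, and the choice of the 1-minute load average as the reported value is an assumption.

```python
#!/usr/bin/env python3
"""Hypothetical "load" script: prints the graph config or the current value."""
import os
import sys

# Copied from the example "./load config" output in this document.
CONFIG_LINES = [
    "graph_title Load average",
    "graph_args --base 1000 -l 0",
    "graph_vlabel load",
    "graph_scale no",
    "load.label load",
    "load.warning 10",
    "load.critical 120",
]


def render(args):
    """Return the lines the script should print for the given arguments."""
    if args and args[0] == "config":
        return CONFIG_LINES
    # Assumption: report the 1-minute load average (os.getloadavg is Unix only).
    one_minute = os.getloadavg()[0]
    return ["load.value %.2f" % one_minute]


if __name__ == "__main__":
    print("\n".join(render(sys.argv[1:])))
```

Run without arguments, the script prints a single "load.value" line; run with "config", it prints the graph configuration.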

All scriptnames containing other characters than alphanumerics, "-", "_", and ".", or starting with "." will be skipped.
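
The skip rule can be expressed as a single regular expression. This is a sketch only; it assumes "alphanumerics" means ASCII letters and digits, and the function name is_loadable_script is made up for this example.

```python
import re

# Allowed characters: alphanumerics, "-", "_" and "."; a leading "." is rejected.
SCRIPT_NAME_RE = re.compile(r"[A-Za-z0-9_-][A-Za-z0-9_.-]*")


def is_loadable_script(filename):
    """Return True if a script with this file name would not be skipped."""
    return SCRIPT_NAME_RE.fullmatch(filename) is not None
```

Under this rule, editor backup files such as "load~" and hidden files such as ".load.swp" are skipped, while a name like "df.old" is still loaded.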

To run a plugin as a specific user and/or group, create a file in the plugin configuration directory. This file is parsed when lrrd-client starts up. It can contain the following options:

[<plugin-name>]
The following lines are for plugin-name.
user <username|userid>
Run plugin as this user
group <groupname|groupid>
Run plugin as this group
command <command>
Run command instead of plugin. "%c" will be expanded to what would otherwise have been run. E.g. "command sudo -u root %c".
env.<var> <contents>
Will cause the environment variable "var" to be set to "contents" when running the plugin.
Example:
[exim_mailstats]
group mail

[cps_*]
user root

# Will cause the variable "mysqlopts" to be set...
[mysql_*]
env.mysqlopts --user foo --password fii
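
The following Python sketch shows one way such a file could be interpreted. It is not lrrd-client's actual parser. It assumes that section names are shell-style wildcards (as "cps_*" suggests), that later matching sections override earlier ones, and that option and value are separated by whitespace. All function names are made up for this example.

```python
from fnmatch import fnmatchcase


def parse_plugin_conf(text):
    """Parse the file into a list of (pattern, settings) pairs, in file order."""
    sections = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = {"env": {}}
            sections.append((line[1:-1], current))
            continue
        if current is None:
            continue  # option outside any section: ignored in this sketch
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if key.startswith("env."):
            current["env"][key[len("env."):]] = value
        elif key in ("user", "group", "command"):
            current[key] = value
    return sections


def settings_for(plugin, sections):
    """Merge every section whose pattern matches the plugin name."""
    merged = {"env": {}}
    for pattern, settings in sections:
        if fnmatchcase(plugin, pattern):
            merged["env"].update(settings["env"])
            merged.update({k: v for k, v in settings.items() if k != "env"})
    return merged


def expand_command(template, original):
    """Replace "%c" with the command that would otherwise have been run."""
    return template.replace("%c", original)
```

For the example file above, settings_for("mysql_queries", ...) would yield an env entry "mysqlopts" with the value "--user foo --password fii", and a plugin named "load" would get no special settings.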

File locations

According to FHS, this is where you should place the files.

System package (Debian, RedHat, maybe others)

CONFDIR
/etc/lrrd/
SBINDIR
/usr/sbin/
LIBDIR
/usr/share/lrrd/
STATEDIR
/var/run/lrrd/
LOGDIR
/var/log/lrrd/

Independent install (tarball)

CONFDIR
/etc/opt/lrrd/
SBINDIR
/opt/lrrd/sbin/
LIBDIR
/opt/lrrd/lib/
STATEDIR
/var/run/lrrd/
LOGDIR
/var/log/lrrd/

The Server

The server runs a cronjob as the user lrrd every 5 minutes. The cronjob runs lrrd-update, lrrd-nagios, lrrd-graph and lrrd-html one by one. Each script creates a lockfile in STATEDIR (see File locations). Every time a script starts, it checks whether the pid in the lockfile is alive before starting.
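
The lockfile check can be sketched in Python as follows. This is an illustration of the technique, not the code used by the server scripts; it assumes the lockfile contains just the pid as text, and the function names are made up. Note that the check-then-write sequence is not atomic, so two scripts started at the same instant could both acquire the lock.

```python
import os


def lock_is_stale(lockfile):
    """Return True if the lockfile is missing, unreadable or names a dead pid."""
    try:
        with open(lockfile) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return True
    if pid <= 0:
        return True  # not a valid process id
    try:
        os.kill(pid, 0)  # signal 0: existence check only, nothing is delivered
    except ProcessLookupError:
        return True  # no such process
    except PermissionError:
        return False  # process exists but belongs to another user
    return False


def acquire_lock(lockfile):
    """Write our pid to the lockfile unless a live process already holds it."""
    if not lock_is_stale(lockfile):
        return False
    with open(lockfile, "w") as handle:
        handle.write("%d\n" % os.getpid())
    return True
```

A script would call acquire_lock() at startup and exit immediately if it returns False.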

/etc/lrrd/lrrd-server.conf

This is the configuration file for all server scripts.
#Configfile for lrrd-server
dbdir       /var/lib/lrrd/
htmldir     /var/www/lrrd/
logdir      /var/log/lrrd
rundir      /var/run/lrrd/

#To warn Nagios
#nsca           /usr/bin/send_nsca
#nsca_config    /etc/nagios/send_nsca.cfg
#nsca_server    nagios.server.org

#
# Edit and uncomment the following to start surveillance
#
#[machine.testdomain.org]
#  address localhost

Explanation:
dbdir
Root directory for all rrd-files (files go into $dbdir/$domain/)
htmldir
Where the PNGs and HTML files end up
logdir
Where to send logs
rundir
Where to put state files
htaccess
The default htaccessfile
tmpldir
Where the templates reside
nsca*
Nagios options. See separate section
[foo.com;machine.dom.ain]
Add machine.dom.ain to domain foo.com.
[machine.dom.ain]
Add machine.dom.ain to domain dom.ain. (A short form of [dom.ain;machine.dom.ain].)
To add a new node, just put in a new section and add the address option.
Domain-level options
node_order
Changes order of nodes in a domain. (Default is alphabetically sorted.)
Node-level options
address
Set the node address
port
Set node port number (default 4949)
use_node_name
Set to "yes" or "y" to force getting all the default plugins from a client. Good for hosts which change hostname (e.g. laptops).
use_default_name
Set to "yes" or "y" to force getting all the default plugins from a client. Good for hosts which change hostname (e.g. laptops). NOTE: Deprecated. Use use_node_name instead.
Field-level options
sum
Summarise other fields. See the FAQ for how to use this.
stack
Stack other fields. See the FAQ for how to use this.
+++
Check the client configuration (further up) for everything else.
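
The two section-header forms described above ([domain;node] and the short form [node]) can be split as in the Python sketch below. It is an illustration under the rules stated in this document, not the server's parser; the function name is made up, and the behaviour for a node name without any dot (empty domain) is an assumption.

```python
def parse_node_section(header):
    """Split "[domain;node]" or "[node]" into a (domain, node) pair."""
    inner = header.strip()
    if not (inner.startswith("[") and inner.endswith("]")):
        raise ValueError("not a section header: %r" % header)
    inner = inner[1:-1]
    if ";" in inner:
        domain, node = inner.split(";", 1)
    else:
        node = inner
        # Short form: the domain is everything after the first dot.
        _, _, domain = node.partition(".")
    return domain, node
```

For example, "[machine.dom.ain]" gives the domain "dom.ain" and the node "machine.dom.ain", the same node as "[dom.ain;machine.dom.ain]".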

lrrd-update

Lrrd-update reads /etc/lrrd/lrrd-server.conf, searches for nodes, and connects to the lrrd-clients using the address field. Once connected, it runs the list command to fetch the available scripts, then runs config for each script. This configuration is expanded into the /etc/lrrd/lrrd-server.conf file, and the rrd-databases are created. Configuration that has already been expanded is skipped. Lrrd-update then runs through its newly modified configuration file and runs fetch on all scripts.

lrrd-graph

Lrrd-graph reads /etc/lrrd/lrrd-server.conf and graphs all services unless "[service].graph no" is set. The following options are available in the configuration. Names are limited to 19 characters.
[client].graph_title
The title of the graph
[client].graph_order
Which order to graph the lines.
[client].graph_args
Extra arguments to the graph
[service].label
REQUIRED. The name of the value to be graphed.
[service].type
Type of value: COUNTER or GAUGE. Defaults to GAUGE. NOTE: When GAUGE is used, only "snapshots" taken every 5 minutes are recorded. Peaks in between updates will not be graphed. When you use COUNTER, the numbers are averaged out over the past 5 minutes, so short peaks will show up as substantially lower than they were.
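
The averaging effect of COUNTER can be shown with a small calculation. The traffic figures below are invented for illustration, and the sketch ignores counter wrap-around.

```python
STEP_SECONDS = 300  # the 5-minute update interval


def counter_rate(previous, current, step=STEP_SECONDS):
    """Average per-second rate between two COUNTER readings."""
    return (current - previous) / step


# Hypothetical interval: 290 s at 10 units/s plus a 10 s burst at 1000 units/s.
previous = 0
current = 290 * 10 + 10 * 1000
rate = counter_rate(previous, current)
```

Here rate is 43 units/s: the 10-second burst at 1000 units/s is spread over the whole 300-second interval and barely stands out in the graph.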

lrrd-html

Lrrd-html creates the html-pages for the graphs.
Useful configuration in the server.conf file is:
node_order [node1] [node2] ....
In which order the nodes should be listed, defaults to sorted.

lrrd-nagios

Lrrd-nagios is an optional script that sends a passive alert to a Nagios server. For this to work, you need a nagios-nsca server, a working send_nsca configuration and the following configuration in /etc/lrrd/lrrd-server.conf:
nsca          /usr/bin/send_nsca
nsca_config   /etc/nagios/send_nsca.cfg
nsca_server   [nsca-server] 
Then add .warning and .critical fields in your configuration or directly into your client scripts. The value for these fields can be a single max value or a colon-separated range:
processes.warning 10:300
processes.critical 5:500
A value lower than 10 or higher than 300 will result in a warning to Nagios; a value lower than 5 or higher than 500 will result in a critical to Nagios.

Other useful ranges:

[service].warning :400
is equal to:
[service].warning 400
Only warn if lower than 300:
[service].warning 300:
When a service contains .critical or .warning, its status will be checked against the last fetched value. If the value is ok, a "{service}.ok" file will be created in the $dbdir/$domain directory. If the value is not ok, this file will be removed and lrrd-nagios will update Nagios every 5 minutes until the value is ok again and a new ".ok" file is created.
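
The range syntax for .warning and .critical can be evaluated as in the Python sketch below. It is an illustration, not lrrd-nagios code: the function names are made up, the bounds are treated as inclusive (as the examples above imply), and letting critical take precedence over warning is an assumption.

```python
def parse_range(spec):
    """Turn "min:max", "min:", ":max" or "max" into a (low, high) pair."""
    if ":" in spec:
        low, high = spec.split(":", 1)
    else:
        low, high = "", spec  # a bare number is a max value
    return (float(low) if low else None, float(high) if high else None)


def in_range(value, spec):
    """Return True if value lies within the range described by spec."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


def status(value, warning=None, critical=None):
    """Classify a fetched value; critical wins over warning (an assumption)."""
    if critical is not None and not in_range(value, critical):
        return "critical"
    if warning is not None and not in_range(value, warning):
        return "warning"
    return "ok"
```

With processes.warning 10:300 and processes.critical 5:500, a value of 50 is ok, 7 and 400 give a warning, and 3 and 600 are critical.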

File locations

According to FHS, this is where you should place the files.

System package (Debian, RedHat, maybe others)

CONFDIR
/etc/lrrd/
SBINDIR
/usr/sbin/
LIBDIR
/usr/share/lrrd/
STATEDIR
/var/run/lrrd/
LOGDIR
/var/log/lrrd/
DBDIR
/var/lib/lrrd/

Independent install (tarball)

CONFDIR
/etc/opt/lrrd/
SBINDIR
/opt/lrrd/sbin/
LIBDIR
/opt/lrrd/lib/
STATEDIR
/var/run/lrrd/
LOGDIR
/var/log/lrrd/
DBDIR
/var/opt/lrrd/