This is a page to collect people's experiences and tips for using Cacti. Cacti really appears to be the simplest and nicest damn way to drive RRDTool for server monitoring, so it's worth taking some time to figure out the bits that remain somewhat obscure...
Monitoring server uptime seems like a fairly simple task, but it's more complex than you might think, and it provides some useful background on the issues involved in setting up Cacti.
This tutorial will work through creating a template and then creating an instance of that template for a particular host:
Create a data template
Create a CDEF formula
Create a graph template
Create a graph for a particular host
Before starting, you need to know that uptime is available through SNMP as ticks (we assume here that they are 100ths of a second) and that the OID to request is "HOST-RESOURCES-MIB::hrSystemUptime.0". YMMV, especially if you're not looking at a Linux system...
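If you want to check what the raw value looks like before going near Cacti, something like the following may help. It is only a sketch: it assumes you have captured a line of output in the style that net-snmp's snmpget prints for a tick value, and both the sample line and the parse_timeticks helper are made up for illustration.

```python
import re

# Hypothetical sample line in the style net-snmp's snmpget prints for a
# tick value. Your agent's output may be formatted differently.
SAMPLE = "HOST-RESOURCES-MIB::hrSystemUptime.0 = Timeticks: (8640000) 1 day, 0:00:00.00"


def parse_timeticks(line):
    """Return the raw tick count from a 'Timeticks: (N) ...' line, or None."""
    match = re.search(r"Timeticks:\s*\((\d+)\)", line)
    return int(match.group(1)) if match else None


print(parse_timeticks(SAMPLE))
```

The number in brackets is the tick count that Cacti will store, so that is the value the rest of this tutorial works with.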
Click on "Data Templates". At the RHS of the header above the list that comes up, click on "Add".
Give it a title, like "ucd/net - Uptime (ticks)". You should also use that for the "Name" field.
The option to "Use per data source value" lets you ultimately override this for each individual data source that this is used as a template for. We ain't going to need that for _any_ of these fields, so ignore all of them.
The "Data Input Method" we are going to use is to "Get SNMP Data". We only want a single value, so we don't want the "... (indexed)" alternative which is used for (e.g.) getting the traffic across all network interfaces.
The default is for all RRAs to be selected. You probably want to leave it that way, so that older data is kept at progressively coarser resolution.
The "Step" is the frequency of each data point. For Uptime I think 300 seconds is OK, but it could perhaps be higher. Each data collection point is some overhead, and they all add up, so you should think hard before reducing this below five minutes.
"Data Source Active" should be checked. You can use this individually on host data sources templated from this to disable collection for some reason, but it's silly to template it to anything other than "Active".
We are only going to have a single Data Source Item in this source. Other sources may return multiple values, ordered according to some sort of index field, but that isn't appropriate for this.
So the item needs a name - we'll call it "uptime".
The uptime can't be negative, so the lowest reasonable value is 0. Anything less than that will be considered bogus data and will be ignored.
We would like to believe our uptime can be pretty large. Hopefully an unsigned int4, so we'll set the maximum to a rather optimistic 4000000000.
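To see how optimistic that maximum really is, here is a quick bit of arithmetic. It assumes the 100-ticks-per-second figure from the start of this page and a 32-bit unsigned value; the constant names are just for illustration.

```python
TICKS_PER_DAY = 100 * 86400      # assumes 100 ticks per second
UINT32_MAX = 2**32 - 1           # largest value an unsigned 32-bit field can hold
TEMPLATE_MAX = 4000000000        # the maximum entered in the data template

# The template maximum has to fit inside the 32-bit range.
print(TEMPLATE_MAX <= UINT32_MAX)

# Days of uptime represented by each limit.
print(round(TEMPLATE_MAX / TICKS_PER_DAY, 2))
print(round(UINT32_MAX / TICKS_PER_DAY, 2))
```

Under those assumptions the template maximum is reached after roughly 463 days, and a 32-bit tick value itself would run out at roughly 497 days.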
The data source type in this case is a GAUGE. We don't measure the difference from one query to the next (which is what COUNTER would give us), and we're not going to derive something else from it (although maybe that's an alternative way to do what we're going to do in a minute...).
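The difference between the two types can be sketched roughly like this. It is a simplification for illustration only: the function names are made up, the samples are invented (timestamp, ticks) pairs, and real RRDTool also deals with things like counter wrap-around that are ignored here.

```python
def as_gauge(samples):
    """GAUGE: keep each reading exactly as it was collected."""
    return [value for _, value in samples]


def as_counter(samples):
    """COUNTER (simplified): per-second rate of change between readings."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates


# Invented (timestamp in seconds, uptime in ticks) readings, 300 seconds apart.
samples = [(0, 1000), (300, 31000), (600, 61000)]

print(as_gauge(samples))    # [1000, 31000, 61000]
print(as_counter(samples))  # [100.0, 100.0]
```

As a COUNTER the uptime would just be a flat line at 100 ticks per second, which tells us nothing, so GAUGE is the type we want here.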
The only field we need to enter anything into here is the "OID" field because we expect all of the other fields to be provided from the host information. We don't need to tick "Use per data source value" for them either, because they will come from the _host_ data, not a per data source customisation.
The OID in this case should be the one we mentioned above: HOST-RESOURCES-MIB::hrSystemUptime.0
The uptime is measured in ticks, and while that's really interesting, we want to see it in decimal days because that's generally more useful to us. For Linux this means dividing the number by 8640000 (100 ticks per second times 86400 seconds per day), so we will create a formula to do this.
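For anyone who wants to check where 8640000 comes from, here is the arithmetic as a small sketch. It assumes 100 ticks per second, as stated at the start of this page; the names are illustrative only.

```python
TICKS_PER_SECOND = 100            # assumption: ticks are 100ths of a second
SECONDS_PER_DAY = 24 * 60 * 60    # 86400
TICKS_PER_DAY = TICKS_PER_SECOND * SECONDS_PER_DAY


def ticks_to_days(ticks):
    """Convert an uptime in ticks to decimal days."""
    return ticks / TICKS_PER_DAY


print(TICKS_PER_DAY)             # 8640000
print(ticks_to_days(12960000))   # 1.5
```

If your system reports ticks at a different rate, only TICKS_PER_SECOND changes and the divisor follows from it.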
For some reason, known only to the Cacti developers, CDEFs are available in a sub-menu that expands when we click on "Graph Management". Click on "CDEFs" and then "Add". First we have to give it a name, so we'll call ours "Ticks to Days" and then click "Create" to get the record there.
The formula is stack based, so we will add three "CDEF items":
The current database value.
A constant 8640000.
The division operator.
Well, go on and do it then...
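If "stack based" doesn't mean much to you, the three items above are evaluated roughly like this. This is only a sketch of the general idea, not Cacti's or RRDTool's actual code, and the "CURRENT" token is a made-up placeholder for the current data source value.

```python
import operator

OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(items, current_value):
    """Evaluate a list of stack-based (RPN) items for one data value."""
    stack = []
    for item in items:
        if item == "CURRENT":            # placeholder for the stored value
            stack.append(float(current_value))
        elif item in OPERATORS:          # pop two operands, push the result
            right = stack.pop()
            left = stack.pop()
            stack.append(OPERATORS[item](left, right))
        else:                            # anything else is a constant
            stack.append(float(item))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


TICKS_TO_DAYS = ["CURRENT", "8640000", "/"]
print(eval_rpn(TICKS_TO_DAYS, 25920000))  # 3.0
```

The order matters: the value and the constant go onto the stack first, and the division operator comes last to consume them both.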
Right, now that you've created the formula that we will use in the graph, you can create the graph template.
Choose "Graph Templates" and click on "Add" (top right) again.
We'll need a name, so let's call this "ucd/net - Uptime".
The title should be something like "|host_description| - Uptime" so that the host description gets stuffed in there when it is instantiated.
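The substitution works along these lines. This is a sketch of the idea rather than how Cacti actually implements it, and the expand_title helper and the "web01" host description are invented for the example.

```python
import re


def expand_title(template, host_fields):
    """Replace |name| placeholders with values from host_fields."""
    def substitute(match):
        name = match.group(1)
        # Leave unknown placeholders untouched rather than guessing.
        return str(host_fields.get(name, match.group(0)))

    return re.sub(r"\|(\w+)\|", substitute, template)


# Hypothetical host with the description "web01".
print(expand_title("|host_description| - Uptime", {"host_description": "web01"}))
```

So one graph template gives each host a graph with its own description in the title.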
Everything else is pretty much the defaults. Override the size if you prefer them larger or smaller (but you can tweak this stuff later, so don't get too fussy at this point).
You might like to set the vertical label to "days", since that's what we're hoping to have displayed there...
Save what you have so far, and two new sections will appear at the top, called "Graph Template Items" and "Graph Item Inputs".
We'll add two graph template items:
The actual uptime, as a plotted value.
The current value, as text at the bottom of the graph.
You can also get bonus points for adding the average and maximum at the bottom too.
Click "Add". Choose the data source template that we added above from the drop-down list.
Pick a colour.
The "Graph Item Type" is going to be "AREA", although it could be "LINE" if you prefer.
The "Consolidation Function" is going to be "AVERAGE". This is true in almost all cases, and it defines how the values in a range of time periods will get squished together when we are displaying wider and wider time periods and so we are losing detail.
The "CDEF Function" will be the one we just created earlier...
And we ignore the remaining fields and click "Save".
We're now back at the graph template page, so we click "Add" again.
We again choose the data source template that we added above from the drop-down list.
The "Graph Item Type" is going to be "GPRINT".
The "Consolidation Function" in this case is going to be "LAST" because in any resolution loss we will still want to be displaying the most recent value.
The "CDEF Function" will again be the "Ticks to Days" one we created earlier.
The "Text Format" field should be "Current:".
Click "Save" to save it.
Now we are back in the Graph Template and we can click "Save" to save that as well, since we are basically done. Remember, we can come back later and tweak colours and so forth and it will apply across all graphs that use this template. Cool.
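The two consolidation functions used in the graph items above behave roughly as follows when several data points get squished into one. This is a simplified sketch, not RRDTool's implementation; the consolidate helper and the sample values are invented for illustration.

```python
def consolidate(values, points_per_bucket, function):
    """Squash runs of data points into one value each, using AVERAGE or LAST."""
    buckets = [
        values[i:i + points_per_bucket]
        for i in range(0, len(values), points_per_bucket)
    ]
    if function == "AVERAGE":
        return [sum(bucket) / len(bucket) for bucket in buckets]
    if function == "LAST":
        return [bucket[-1] for bucket in buckets]
    raise ValueError("unsupported consolidation function: " + function)


# Invented uptime readings, already converted to days.
uptime_days = [1.0, 2.0, 3.0, 4.0]

print(consolidate(uptime_days, 2, "AVERAGE"))  # [1.5, 3.5]
print(consolidate(uptime_days, 2, "LAST"))     # [2.0, 4.0]
```

AVERAGE smooths the plotted area, while LAST keeps the most recent reading in each period, which is what we want for the "Current:" text.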
Odds are that if you have fiddled with Cacti enough to get to the start of this tutorial, you already know all this, but it's included here for completeness...
Assuming that you already have a host, and all of its SNMP stuff is working, let's add a new graph to that host for its uptime.
Go to the "Devices" section and choose the server you want to add the Uptime graph to.
On that server, in the "Associated Graph Templates", choose the "ucd/net - Uptime" option that we have added and click the "Add" button. The graph will now appear in the list as "Not being graphed".
Click on the "Create Graphs for this Host" link at the top of the page and the Graph Templates will then be listed with a checkbox next to the ones that are not currently being graphed. Check the box and click "Create" to create them.
Wait a bit longer.
After about 10 minutes you should be able to go to the graphs for your host and see the uptime there.
Thank you and good night...
Andrew McMillan, www.catalyst.net.nz