[HOWTO] 0.8.7 and 1 minute polling

gandalf
Developer
Posts: 22383
Joined: Thu Dec 02, 2004 2:46 am
Location: Muenster, Germany

Post by gandalf »

toni75 wrote:I have a question: if I do this, can all my historical data be wiped out?
Not "can": it WILL DEFINITELY BE WIPED OUT.
R.
yuval_ba
Posts: 32
Joined: Mon Oct 13, 2008 6:19 am

Re: RRA and Data Template Settings for 0.8.7

Post by yuval_ba »

tekbot wrote:Rows: The Rows Value defines the number of Steps that each RRA should hold. This defines "the width of the rolling window", or in other words, the amount of time old data will be kept in the RRA before it is dropped off. As mentioned above, my settings may well differ from what would work for you. The settings above define the following for each polling interval:

Code: Select all

Interval / View / Rows / Storage Duration
10s / 24h / 25920 / 72h
1m / 24h / 4320 / 72h
5m / 24h / 864 / 72h

10s / 7d / 44640 / 31d
1m / 7d / 44640 / 31d
5m / 7d / 8928 / 31d

10s / 1mo / 25920 / 90d
1m / 1mo / 25920 / 90d
5m / 1mo / 25920 / 90d

10s / 1y / 35040 / 3y
1m / 1y / 35040 / 3y
5m / 1y / 35040 / 3y
I'll break down the 24 hour view and leave the rest of the math to you guys. For the 10 second poller 24 hour view, I want to keep 3 days worth of data at 10s granularity. So, the question is "How many steps are there in 72 hours?" The answer is ROWS. So, we have 6 polls per minute, 60 minutes per hour times 72 hours, or (6 * 60 * 72), or 25920. For the 1 minute poller 24 hour view I want 3 days worth of data at 1m granularity, so that equation is going to be (1 * 60 * 72) or 4320. For the 5 minute poller 24 hour view I want 3 days worth of data at 5m granularity so (12 * 72) = 864 (12 polls per hour times 72 hours).
Hi tekbot,
First of all, thanks for your great post! It helped me a lot to configure my 1 minute interval polling.
I still have one question regarding the Rows calculation.

I understand the math behind the Rows calculation, but I do not understand why you chose longer periods than the period represented by the Timespan value.

For example: in your 1 minute poll / 24 hour view, why did you choose a Rows value of 4320, which is equal to 3 days, if only 24 hours will be displayed in the graph because of the Timespan value? My common sense would be to choose a Rows value of 1440 in this case. What am I missing?

thanks
Yuval
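The Rows arithmetic in the quoted post can be sketched in a few lines of Python. This is only an illustration: rows_needed and its parameters are names made up for this example, not anything Cacti or RRDtool provides.

```python
def rows_needed(retention_hours, poll_interval_seconds, steps=1):
    """Number of rows an RRA needs to hold retention_hours of data.

    Each row covers poll_interval_seconds * steps seconds.
    """
    seconds_per_row = poll_interval_seconds * steps
    return (retention_hours * 3600) // seconds_per_row

# 72 hours of retention, as in tekbot's 24h view
print(rows_needed(72, 10))   # 25920
print(rows_needed(72, 60))   # 4320
print(rows_needed(72, 300))  # 864

# Exactly 24 hours at 1 minute granularity, the value asked about
print(rows_needed(24, 60))   # 1440
```

With 1440 rows the RRA holds exactly one day at 1 minute granularity; 4320 rows keeps the same granularity for three days.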
eduardosilvestre
Posts: 3
Joined: Wed May 25, 2011 2:09 pm

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by eduardosilvestre »

Hello,

How can I set up "5 min average poll in 1 year view"? The disk space used does not matter; the important thing is to keep the most detailed level of history. :)

Best Regards,
noname
Cacti Guru User
Posts: 1566
Joined: Thu Aug 05, 2010 2:04 am
Location: Japan

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by noname »

>> How can I set up "5 min average poll in 1 year view"?

It needs:
(365days * 24hours * 60min) / 5min = 105120 pixel width...
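As a rough sketch of the same arithmetic in Python: the 500 pixel graph width below is an assumption chosen for illustration (check the Width field of your own graph template), and points_in_span is a made-up helper name.

```python
def points_in_span(days, resolution_minutes):
    """Number of data points needed to cover `days` at the given resolution."""
    return days * 24 * 60 // resolution_minutes

points = points_in_span(365, 5)
print(points)  # 105120

# Assumed graph width in pixels; not taken from this thread
ASSUMED_WIDTH = 500
print(points // ASSUMED_WIDTH)  # 210 data points would share each pixel column
```

So unless the graph is about 105120 pixels wide, many 5 minute values end up sharing one pixel column in a 1 year view.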
eduardosilvestre
Posts: 3
Joined: Wed May 25, 2011 2:09 pm

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by eduardosilvestre »

Where and how can I change these settings?
noname
Cacti Guru User
Posts: 1566
Joined: Thu Aug 05, 2010 2:04 am
Location: Japan

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by noname »

>> Where and how can I change these settings?

R... really? :o

Graph Templates -> (any template) -> Width
eduardosilvestre
Posts: 3
Joined: Wed May 25, 2011 2:09 pm

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by eduardosilvestre »

Hello noname,

thanks for your reply. It's my first time using cacti :-?

Best Regards
Joost
Posts: 7
Joined: Wed May 19, 2010 7:55 am
Location: Amsterdam, The Netherlands

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by Joost »

I went down from 5 minute to 30 second polling last year, but I still notice that the bandwidth figures generated do not come close to the actual utilization.

Is it possible to let Cacti run at an even lower interval, like 10 seconds? I wouldn't want to do this for all my graphs as I think my server would collapse, but it would be cool if I can set something like this up for only the critical circuits.
willieb
Cacti User
Posts: 160
Joined: Thu Jan 22, 2009 10:09 am
Location: South GA

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by willieb »

OK, I was able to get 1 minute polling to work on my test box fairly easily, thanks mainly to this thread. I will post my steps later, as they may help somewhat. But I have a question.

Out of the box as the averages grow from 5 minute, 30 minute, 2 hours, and finally 1 day, I notice that the bandwidth graph's peaks are reduced greatly. So if we look back earlier in the year we were actually using much higher bandwidth than it shows.

So the question is, if I set the RRAs to all steps of 1 (increasing rows accordingly), would this eliminate the problem mentioned?

If the answer is yes then I'll configure all of them with step values of 1. I assume the only disadvantage would be larger rrd files as you get to yearly. I even thought about setting up a 1 minute average with a 3 year span. Other than a huge rrd file, would this be an issue? I've got half a TB available.

Thanks everyone...
-willieb
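The shrinking peaks described above are what you would expect when several samples are consolidated into one row with an average. Here is a small Python sketch of the effect. The traffic numbers are invented for illustration, and consolidate is a made-up helper, not RRDtool code.

```python
def consolidate(samples, steps, func):
    """Group consecutive samples into chunks of `steps` and reduce each chunk."""
    return [func(samples[i:i + steps]) for i in range(0, len(samples), steps)]

def average(chunk):
    return sum(chunk) / len(chunk)

# Hypothetical traffic in Mbit/s, one sample per minute, with one short burst
samples = [10, 10, 10, 90, 10, 10, 10, 10, 10, 10]

print(consolidate(samples, 5, average))  # [26.0, 10.0]  the 90 burst is averaged away
print(consolidate(samples, 5, max))      # [90, 10]      a MAX archive keeps the peak
print(consolidate(samples, 1, average))  # steps of 1 keep every sample unchanged
```

With steps of 1 nothing is averaged inside the RRA, at the cost of one row per poll. A MAX archive is the other way to keep peaks while still using larger steps.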
willieb
Cacti User
Posts: 160
Joined: Thu Jan 22, 2009 10:09 am
Location: South GA

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by willieb »

There are many ways to skin a cat, but here are the steps that worked for me. My goal here is to give you the simplest instructions to get 1 minute polling to work with RRA settings as close as possible to the out-of-the-box ones. I'm not going to explain too much since it's already here, so just read this entire thread.

If this is not a fresh install of cacti, I recommend you try this on a test installation first. I did. Here we go...


1.) Stop the scheduled task (or cron job). Change its interval from 5 minutes to 1 minute. Do not restart it yet. (I decided I wanted a 1 minute task for more control.)

2.) Settings => Poller => change "Poller Interval" and "Cron Interval" to "Every Minute".

3.) Data Sources => RRA - Change the current RRAs or create your own. I decided to change my current ones because I'm not interested in 5 minute polls. This is the hardest step (at least for me) when it comes to understanding what is going on.

This is my best guess at the minimum RRA changes needed to go from out-of-the-box 5 minute polling to 1 minute polling. "Steps" were changed to reflect the new 1 minute polling. I left "Rows" the same because they are well over the minimum number of rows needed for 1 minute cycles. Note: I did not test these settings; see my actual RRA settings at the bottom.

Code: Select all

Name                                Steps          Rows          Timespan
Hourly (1 Minute Average)            1              500          14400
Daily (5 Minute Average)             5              600          86400
Weekly (30 Minute Average)           30             700          604800
Monthly (2 Hour Average)             120            775          2678400
Yearly (1 Day Average)               1440           797          33053184
4.) Back up your cacti db using phpmyadmin, or stop mysql, copy the data folder, and restart mysql. I did both.
Issue these commands to update step & heartbeat for all data templates:
C:\mysql\bin>mysql --user=cacti --password cacti (note: change --user=yourusername and the ending "cacti" to your db name.)
mysql> update data_template_data set rrd_step='60';
mysql> update data_template_rrd set rrd_heartbeat='120';

5.) Delete all rrd files in ./rra (in cacti website folder/rra)

6.) System Utilities => Rebuild Poller Cache

7.) Start the scheduled task

8.) Check System Utilities => View Cacti Log File for errors and view the new graphs.

Give it a few cycles and you should start seeing 1 minute cycle data. Hit Rebuild Poller Cache again if you have any issues. Post results if you try it.


Here are my actual RRA settings. I wanted 1 minute averages on all of them, and I added a 3 year graph.

Code: Select all

Name                                Steps          Rows          Timespan
Hourly  (1 Minute Average)           1              300           14400
Daily   (1 Minute Average)           1              1500          86400
Weekly  (1 Minute Average)           1              11000         604800
Monthly (1 Minute Average)           1              45000         2678400
Yearly  (1 Minute Average)           1              526000        31536000
3 Years (1 Minute Average)           1              1576800       94608000
I hope this helps someone. Thanks.
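A quick way to sanity check a table like this is to confirm that each RRA stores at least as much time as its Timespan asks for. The Python sketch below does that for the values in the table above, assuming the 60 second step set by the SQL update earlier in this post. coverage_seconds is a made-up helper name.

```python
STEP_SECONDS = 60  # rrd_step value from the SQL update in the steps above

# (name, steps, rows, timespan in seconds), copied from the table above
rras = [
    ("Hourly", 1, 300, 14400),
    ("Daily", 1, 1500, 86400),
    ("Weekly", 1, 11000, 604800),
    ("Monthly", 1, 45000, 2678400),
    ("Yearly", 1, 526000, 31536000),
    ("3 Years", 1, 1576800, 94608000),
]

def coverage_seconds(steps, rows, step_seconds=STEP_SECONDS):
    """Time span stored by an RRA: one row per `steps` polls."""
    return steps * rows * step_seconds

for name, steps, rows, timespan in rras:
    covered = coverage_seconds(steps, rows)
    status = "ok" if covered >= timespan else "too short"
    print(f"{name:8} stores {covered / 86400:8.2f} days, "
          f"view spans {timespan / 86400:8.2f} days: {status}")
```

Every row in this table passes the check; the 3 year RRA stores exactly as much as its view spans, with no margin.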

Edit: Although 1 minute polling is very doable and it worked great, too many things were broken (gaps in the graphs, thresholds not working properly, etc.), so I went back to a fresh install with standard 5 minute polling. I opted for the realtime plugin when I needed quicker info.
Last edited by willieb on Mon Jul 25, 2011 10:16 am, edited 1 time in total.
-willieb
willieb
Cacti User
Posts: 160
Joined: Thu Jan 22, 2009 10:09 am
Location: South GA

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by willieb »

I hope the above post helps someone...

A couple more related questions.

1.) What is the most graphs anyone has using 1 minute polling with spine and separately cmd and what are your processes/threads settings?

2.) If my "Poller Interval" and "Cron Interval" are both currently at "1 Minute" and I change my "Cron Interval" in Cacti and my task back to 5 minutes, I shouldn't lose my current rrd files, correct?

The reason I am asking is that I added a device that apparently had a problem responding to SNMP, which caused the task to run past one minute; of course this caused a 1 minute gap in all the graphs. I deleted the device. Would going back to a 5 Minute Cron Interval help with this?

If so how does this help since it's still polling every 60 seconds?

Thanks.
-willieb
DWAyotte
Posts: 32
Joined: Wed Mar 28, 2007 1:37 pm

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by DWAyotte »

I have followed the steps detailed by tekbot and willieb but I can't seem to get my graphs working. I continue to get -nan values and have followed both sets of directions various times (excluding willieb's changing of cron to 1 minute).

This is a fresh install. I have tried removing all the graphs for localhost, removing the rrd files, rebuilding cache, resaving all the graph template items, but no luck. Poller logging level is set to HIGH and this is what I see:

Code: Select all

07/14/2011 10:50:02 AM - SYSTEM STATS: Time:1.2667 Method:spine Processes:1 Threads:1 Hosts:2 HostsPerProcess:2 DataSources:7 RRDsProcessed:6
07/14/2011 10:50:01 AM - SPINE: Poller[0] Time: 0.1791 s, Threads: 1, Hosts: 2
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[39] SCRIPT: perl /var/www/cacti/scripts/query_unix_partitions.pl get available /dev/mapper/Monitor-root, output: 74907580
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[39] SCRIPT: perl /var/www/cacti/scripts/query_unix_partitions.pl get used /dev/mapper/Monitor-root, output: 2702116
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[38] SCRIPT: perl /var/www/cacti/scripts/unix_processes.pl, output: 120
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[37] SCRIPT: perl /var/www/cacti/scripts/unix_users.pl , output: 2
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[36] SCRIPT: perl /var/www/cacti/scripts/loadavg_multi.pl, output: 1min:0.05 5min:0.07 10min:0.05
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[35] SCRIPT: perl /var/www/cacti/scripts/linux_memory.pl SwapFree:, output: 506632
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] DS[34] SCRIPT: perl /var/www/cacti/scripts/linux_memory.pl MemFree:, output: 111044
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] NOTE: There are '7' Polling Items for this Host
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] TH[1] Host has no information for recache.
07/14/2011 10:50:01 AM - SPINE: Poller[0] Host[1] PING: Result UDP: Host is Alive
07/14/2011 10:50:01 AM - SPINE: Poller[0] NOTE: Spine is behaving in a 0.8.7g manner
07/14/2011 10:50:01 AM - SPINE: Poller[0] NOTE: Spine did not detect multithreaded device polling.
07/14/2011 10:50:01 AM - SPINE: Poller[0] Time: 0.0301 s, Threads: 1, Hosts: 1
07/14/2011 10:50:01 AM - SPINE: Poller[0] NOTE: Spine is behaving in a 0.8.7g manner
07/14/2011 10:50:01 AM - SPINE: Poller[0] NOTE: Spine did not detect multithreaded device polling.
07/14/2011 10:50:01 AM - POLLER: Poller[0] NOTE: Poller Int: '60', Cron Int: '300', Time Since Last: '300', Max Runtime '298', Poller Runs: '5'
07/14/2011 10:50:00 AM - SYSTEM STATS: Time:0.1283 Method:spine Processes:1 Threads:1 Hosts:1 HostsPerProcess:1 DataSources:7 RRDsProcessed:0
07/14/2011 10:50:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting.
07/14/2011 10:50:00 AM - SPINE: Poller[0] Time: 0.0335 s, Threads: 1, Hosts: 1
07/14/2011 10:50:00 AM - SPINE: Poller[0] NOTE: Spine is behaving in a 0.8.7g manner
07/14/2011 10:50:00 AM - SPINE: Poller[0] NOTE: Spine did not detect multithreaded device polling.
07/14/2011 10:50:00 AM - POLLER: Poller[0] -1310662021.9743 seconds
07/14/2011 10:50:00 AM - SYSTEM STATS: Time:238.2447 Method:spine Processes:1 Threads:1 Hosts:1 HostsPerProcess:1 DataSources:7 RRDsProcessed:0
07/14/2011 10:50:00 AM - POLLER: Poller[0] Maximum runtime of 298 seconds exceeded. Exiting. 
Anyone willing to help me work through this?
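For anyone reading the log above, the POLLER NOTE line is the one that shows how the two intervals interact. The Python sketch below pulls the numbers out of that exact line. The relation it checks (cron interval divided by poller interval equals poller runs) is only what this one log line shows, not something taken from the Cacti source, and field is a made-up helper name.

```python
import re

# The NOTE line copied from the log above
line = ("07/14/2011 10:50:01 AM - POLLER: Poller[0] NOTE: Poller Int: '60', "
        "Cron Int: '300', Time Since Last: '300', Max Runtime '298', Poller Runs: '5'")

def field(name, text):
    """Return the quoted integer that follows `name` (with or without a colon)."""
    match = re.search(re.escape(name) + r":? '(\d+)'", text)
    if match is None:
        raise ValueError(f"{name!r} not found in log line")
    return int(match.group(1))

poller_int = field("Poller Int", line)
cron_int = field("Cron Int", line)
runs = field("Poller Runs", line)

print(poller_int, cron_int, runs)      # 60 300 5
print(cron_int // poller_int == runs)  # True
```

So with cron at 5 minutes and the poller interval at 1 minute, this log shows one cron invocation being expected to perform five poller runs.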

Here are the values of my RRAs

Code: Select all

Name                               Steps  Rows        Timespan
Daily (1 Minute Average)           1      1440        86400
Weekly (1 Minute Average)          1      10080       604800
Monthly (1 Minute Average)         1      44640       2678400
Yearly (1 Minute Average)          1      525600      31536000
3 Year (1 Minute Average)          1      1576800     94608000
I ran the SQL Queries to update all the data sources.

My specs:
I am on Ubuntu Natty x64 with Cacti 0.8.7g + Spine 0.8.7g + PA 2.8 + Nagios 3.2.3
I have the following plugins: Monitor, Mactrack, Settings, NPC and Spikekill

Poller settings:
Type: Spine
Interval: Every Minute
Cron: Every 5 Minutes

RRA dir:
ls -l

Code: Select all

total 303576
-rw-r--r-- 1 cactiuser cacti  69077416 2011-07-14 11:00 localhost_hdd_free_39.rrd
-rw-r--r-- 1 cactiuser cacti 103615408 2011-07-14 11:00 localhost_load_1min_36.rrd
-rw-r--r-- 1 cactiuser cacti  34539424 2011-07-14 11:00 localhost_mem_buffers_34.rrd
-rw-r--r-- 1 cactiuser cacti  34539424 2011-07-14 11:00 localhost_mem_swap_35.rrd
-rw-r--r-- 1 cactiuser cacti  34539424 2011-07-14 11:00 localhost_proc_38.rrd
-rw-r--r-- 1 cactiuser cacti  34539424 2011-07-14 11:00 localhost_users_37.rrd
I am not sure what else would be of help. Thanks!
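The file sizes in this listing line up with the Rows values from the RRA table earlier in this post. The Python sketch below reproduces them under two assumptions that are not stated in the thread: every RRA is created once for AVERAGE and once for MAX, and each stored value takes 8 bytes. The data source counts per file are inferred from the script output in the log (one value for users, two for the partition query, three for load average).

```python
ROWS = [1440, 10080, 44640, 525600, 1576800]  # Rows column of the RRA table above
BYTES_PER_VALUE = 8       # assumption: values are stored as 8 byte doubles
CONSOLIDATION_FUNCS = 2   # assumption: AVERAGE and MAX archives for every RRA

def data_bytes(data_sources):
    """Estimated bytes used by row data, ignoring the rrd header."""
    return sum(ROWS) * CONSOLIDATION_FUNCS * data_sources * BYTES_PER_VALUE

# file name -> (inferred number of data sources, size from the ls -l listing)
listing = {
    "localhost_users_37.rrd": (1, 34539424),
    "localhost_hdd_free_39.rrd": (2, 69077416),
    "localhost_load_1min_36.rrd": (3, 103615408),
}

for name, (ds_count, size) in listing.items():
    estimate = data_bytes(ds_count)
    print(f"{name}: estimated {estimate}, actual {size}, header {size - estimate}")
```

The remainder is a few kilobytes per file, which would be the header. In other words, roughly 33 MB per data source is the price of keeping three years at 1 minute resolution.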
JMoMo
Cacti User
Posts: 60
Joined: Mon Nov 08, 2004 12:11 am

Re:

Post by JMoMo »

tekbot wrote:
gandalf wrote:No, sorry. There is no way to change the step size using rrdtool tune or similar. Using official tools only, it is NOT POSSIBLE to convert a 5 min based rrd file to a 1 min based one.
If you don't trust me :wink: please ask on the rrdtool-users mailing list
Reinhard
Guys,
I have written a script that reformats old 5 minute RRD files into 1 minute RRD files. I ran this against an infrastructure and it worked a charm. I will attach the scripts below. There is some information you need to make this work, and you need to be familiar navigating around Cacti. I'll be as verbose as I can to ease you through this.

(PS - I don't want any code monkeys telling me how sub-optimal this script is. Feel free and improve on it, but don't tell me it's "bad code", because as far as I'm aware, I'm the only one who's successfully done this -- Ever). :)

The script has a few parts, which should be explained here.

GetAndConvert - this is the wrapper script. This script takes one argument, which should be your INPUT_FILE described below. There are constants defined at the top of the script. This is where you tell the script important information about your environment, like where to find the RRD files to convert and where to dump them when they're done. Modify this to suit your needs. Also, since I built my 0.8.7 on a new machine, I used SCP to get the files from a remote host. If you're not doing this, you'll need to edit the script accordingly. (I've put a note on how to get your SCP working without entering a password for each file, if you don't know how to use Public Key Auth).

INPUT_FILE - YOU create this file. This file should be a list of all of the files you want to bring over and convert. You need to strip the .rrd from the filenames for the script to work properly (or you can fix this bug -- code monkey!). An easy way to do this in vim would be

Code: Select all

 :g/\.rrd.*$/s/// 
or you could run something like this on your old rra dir

Code: Select all

 ls | awk -F. '{print $1}' > INPUT_FILE 
Whatever works for you.
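If you would rather build the list in Python, here is a minimal sketch. rrd_basenames and write_input_file are names made up for this example, and the rra directory path is whatever applies to your installation. Unlike the awk one-liner, which cuts at the first dot, this only strips the trailing .rrd.

```python
from pathlib import Path

def rrd_basenames(rra_dir):
    """Return the names of all .rrd files in rra_dir with the suffix removed."""
    return sorted(p.stem for p in Path(rra_dir).glob("*.rrd"))

def write_input_file(rra_dir, input_file="INPUT_FILE"):
    """Write one base name per line to input_file and return the names."""
    names = rrd_basenames(rra_dir)
    Path(input_file).write_text("".join(name + "\n" for name in names))
    return names
```

Calling write_input_file with the path to your old rra directory creates INPUT_FILE in the current directory.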

FormatRRA - This is the guy that does all the work. This requires information from you as well to properly set the multipliers. Basically, what this script does is take the raw XML of your RRD and read through it until it finds its first data block. We then adjust the step value in the data definition and we print every line n times (depending on your multiplier - also discussed later). In simpler terms, the first block of a 5 minute graph has a line that defines STEP, which is set to 300 (seconds). We replace that with a new STEP because we want a 1 minute graph (60 seconds). And, since our step is different, we need more data points to make up for the missing space, so we take each value and print it out 5 times. We do this for every existing line until we encounter a new STEP definition, and we then adjust our multiplier accordingly. Making sense?

Misc Notes:
All of these scripts should be put into the same directory. You should run them all as the cacti user. You need to tell your 0.8.7n installation about your new RRD formats. This can be done under Data Sources -> RRAs. I wrote a post a few months ago describing how to configure your new RRAs for optimal storage at high granularity. That post is here: http://forums.cacti.net/viewtopic.php?t ... c&start=15. I know it's long, but Read It! If you use the format I suggest there, you do not have to edit your multipliers. The only "drawback" to using the formats as I've defined them is that I want more granular data for longer, therefore my RRD files are bigger than the default. MBs are cheap these days.

Output:
This script will churn for a good few minutes on every RRD file, and I didn't bother writing in any MySQL injection (feel free). The output will give you the perfect text string to paste into your Data Source field. Here is your basic order of operations (this is in the README in the gzip attached):

* Create your RRD Structure as defined in my post on 1-minute polling (I'm tekbot).
* Create new Graph for existing device (This creates the datasource in the Database)
* Create input file for list of existing Graphs (.rrd files) from old cacti installation to convert (see section on INPUT_FILE in the post where you found these scripts).
* Edit the Constants at the top of GetAndConvert and FormatRRA to suit your needs.
* Setup SSH Private Key Authentication (Optional)
* Run the script (Syntax: ./GetAndConvert INPUT_FILE)
* Update Data source to point to new RRD.
* Place on Tree!

Have fun kids, and be safe!

--tekbot

Just in case anyone ever finds this and starts to look into it...

The script is nonsense and will just destroy your data. Don't do it. Don't bother even looking into it. It does transform the rra header information similar to what is necessary, but it just munges up the row data with no respect to what that data actually means. I am guessing the original author just didn't understand what was in that rra row data.

I appreciate someone trying to do this, but this is not the solution. I am looking into it myself, but I've figured out that the level of transformation needed for the rows to match up the new information is more work than I am willing to do. I am just going to rename my old devices to *-OLD, deactivate them so that they don't poll any more, create new ones based off of the old devices.

It is technically possible to write a transform script that could do this, but it would require an amount of work that I'm not willing to do, since I only have a couple of hosts and it's pretty easy to archive the old devices, keep them available for a year or two until the data is no longer relevant, and just start new.

Most unfortunate. I would have thought that someone would have come up with an off-the-shelf script to do this conversion, but after looking into it, I realize how much data transformation would be required on the rra rows in the rrd file after exporting to xml. It needs more talent and time than I've got.
rickg421
Posts: 6
Joined: Fri Sep 16, 2011 2:12 pm

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by rickg421 »

I've hit a bit of a wall trying to set up 1 minute polling. I understand from this thread that the theory is to run the poller once every 5 minutes from cron (cron interval = 5 minutes), and with the poller interval at 1 minute it will still execute once a minute. Now, if I have everything set to 1 minute, it works. One minute graphs appear, so I believe that confirms my RRAs are set up correctly. But if I set cron to run every 5 minutes with a 5 minute cron interval, things don't work. I just get POLLER logs like this:

Code: Select all

09/29/2011 02:21:01 PM - POLLER: Poller[0] NOTE: There are no items in your poller for this polling cycle!
09/29/2011 02:22:01 PM - POLLER: Poller[0] NOTE: There are no items in your poller for this polling cycle!
Rebuilding the poller cache (via cli) doesn't change the behavior, either.
moggie
Posts: 11
Joined: Sat Jul 31, 2010 9:38 am

Re: [HOWTO] 0.8.7 and 1 minute polling

Post by moggie »

rickg421 wrote:I've hit a bit of a wall trying to set up 1 minute polling. I understand from this thread that the theory is to run the poller once every 5 minutes from cron (cron interval = 5 minutes), and with the poller interval at 1 minute it will still execute once a minute. Now, if I have everything set to 1 minute, it works. One minute graphs appear, so I believe that confirms my RRAs are set up correctly. But if I set cron to run every 5 minutes with a 5 minute cron interval, things don't work. I just get POLLER logs like this:

Code: Select all

09/29/2011 02:21:01 PM - POLLER: Poller[0] NOTE: There are no items in your poller for this polling cycle!
09/29/2011 02:22:01 PM - POLLER: Poller[0] NOTE: There are no items in your poller for this polling cycle!
Rebuilding the poller cache (via cli) doesn't change the behavior, either.
One minute polling was slightly broken in 0.8.7g, but is fixed now in version 0.8.7h. If you update to the most recent version and set your cron interval back to 5 minutes, you should find it magically works again.