Jun/100
NagVis 1.5 released
NagVis 1.5 is the new major stable release, which I have just announced. It contains a lot of new features and performance improvements compared to the NagVis 1.4x releases.
My personal favorite new features are:
- The new MKLivestatus backend which connects directly to the Nagios core using the MKLivestatus event broker module. No database anymore. No NDO anymore.
- The weathermap lines are pretty cool. They can be used to visualize the load of a network connection using fancy colored lines.
These are only my personal highlights. For the official announcement, take a look at the NagVis news page.
Jun/100
Nagios Workshop 2010 is over
The community event Nagios Workshop 2010 in Nürnberg, Germany is over now. It was a very successful event and a nice platform for exchanging information about Open Source monitoring solutions using Nagios and/or Icinga. It was nice to meet the members of the German Nagios community again.
I uploaded the slides of my NagVis presentation (German) to the publications page on NagVis.org.
Last but not least, thanks to the company qSkills for the nice meeting place, and to the company team(ix) and Sven Velt in person for organizing the event. Nice job!
May/100
See you at Nagios Workshop 2010!
The community-organized German Nagios Workshop 2010 will take place in Nürnberg, Germany. This year the workshop is being organized by Sven Velt.
The event focuses on technical topics for Nagios users and addon developers. It is a two-day event full of monitoring and Nagios related topics.
There are very interesting topics on the list. For example, Mathias Kettner talks about check_mk and the very young addons MKLivestatus and the Multisite GUI, Gerhard Lausser will give an insight into the Nagios rewrite Shinken, Sven Nierlein will introduce the MKLivestatus based web interface Thruk, and there will be talks about the basics of RRDTool (Simon Meggle) and the RRDTool Caching Daemon (Sebastian „tokkee“ Harl). There are several other topics on the list. Just take a look at it.
Last but not least: I’ll talk about NagVis 1.5 and the newest steps in development (Need to take a look at the slides now …).
I really look forward to this event. I’ll start the journey to Nürnberg tomorrow evening from Munich.
May/102
Nagios Statusmap – trash it!
The Nagios statusmap has been delivered with the default web interface for more than 10 years now. There has been no major change or innovation in the statusmap during this time. Even the Icinga fork has made no real change to make the statusmap more usable.
Apr/100
MKLivestatus – Experience the new way of getting Nagios live data
I introduced MKLivestatus before, so I won’t go into detail about what MKLivestatus is and what it is meant for. Today I give you some simple examples of how straightforward it is to get status information out of Nagios using Livestatus.
Mar/108
LivestatusSlave – Webservice for MKLivestatus
MKLivestatus is a Nagios Event Broker (NEB) Module which can be used to extend the core of Nagios. The MKLivestatus module provides access to the live status information kept in the running Nagios process. It serves a unix socket for data exchange with external scripts/addons.
Making Livestatus available on the Network
It is possible to make the Livestatus unix socket (which is only available on the system where Livestatus runs) available to remote systems on the network, e.g. using xinetd.
With xinetd the unix socket is served as a TCP socket on the network. Both the TCP and the unix socket can be queried from most programming languages.
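As a rough illustration of such a direct socket query, here is a minimal Python sketch. The function names, the socket path and the TCP port in the usage note are my own assumptions, not part of MKLivestatus; the sketch also assumes the default output format, one row per line with semicolon-separated columns.

```python
import socket


def build_query(table, columns=(), filters=()):
    """Assemble a Livestatus query; an empty line terminates it."""
    lines = ["GET " + table]
    if columns:
        lines.append("Columns: " + " ".join(columns))
    for condition in filters:
        lines.append("Filter: " + condition)
    return "\n".join(lines) + "\n\n"


def query_livestatus(query, unix_path=None, tcp_addr=None):
    """Send a query to a local unix socket or to a (host, port) TCP endpoint."""
    if unix_path is not None:
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        target = unix_path
    else:
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        target = tcp_addr
    with sock:
        sock.connect(target)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # tell the server the request is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    raw = b"".join(chunks).decode("utf-8")
    # Assumed default output: one row per line, columns separated by ";"
    return [line.split(";") for line in raw.splitlines()]
```

A call could look like `query_livestatus(build_query("hosts", ["name", "state"]), unix_path="/usr/local/nagios/var/rw/live")` locally, or with `tcp_addr=("nagios-server", 6557)` when the socket is exported through xinetd; both the path and the port are placeholders for your own setup.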
But in many cases it would be better to have an easier way reaching the Livestatus information.
A big benefit would be to have the Livestatus socket available via HTTP, which makes it queryable using XMLHttpRequest, for example.
With this idea the LivestatusSlave was born.
What is the LivestatusSlave?
The LivestatusSlave is a so called “webservice” written in PHP. Basically LivestatusSlave is a single PHP script which gets the plain livestatus query from a parameter and returns the livestatus response as array in JSON syntax.
LivestatusSlave does not really care about authentication, authorisation or the syntax of the livestatus query. It is only a small translator between the HTTP client and the Livestatus socket.
Downloading LivestatusSlave
LivestatusSlave is in early development. A package of LivestatusSlave can be downloaded here: livestatus-slave-1.1.tar.gz.
There is also a public git repository available which contains the newest sources: mklivestatus-slave.
System Requirements
LivestatusSlave needs a webserver which supports at least PHP 5. PHP needs support for the json and socket functions. You might need to install additional packages to get those modules.
And you also need a running Nagios with a loaded MKLivestatus NEB module.
Installing LivestatusSlave
Just drop the live.php somewhere on your system where it is reachable via a webserver which supports PHP. For example you could place it in your nagios/share directory.
Then you need to edit the $conf Array in live.php to point to your Livestatus socket path.
Example
I placed the live.php in my nagios/share directory so it is reachable now via:
http://<my-nagios-server>/nagios/live.php
It is very easy to query the LivestatusSlave. Simply open the live.php in your Browser with the following URL:
http://<my-nagios-server>/nagios/live.php?q=GET hosts\nColumns: name state\nFilter: name = www.nagvis.org\n
Now I get the following response:
[[0,"OK"],[["www.nagvis.org",0]]]
More readable and with comments added:
[
  // Header
  [
    // Response Code
    0,
    // Response Message
    "OK"
  ],
  // Body
  [["www.nagvis.org",0]]
]
The response is in JSON format. It is an array whose first element is the header, which is an array itself, and whose second element is the response body, which may be an array of elements.
The response header consists of two elements. The first element is the response code, the second element is the description of the response, for example an error message.
The response code is 0 on a successful query and non-zero when a problem occurred.
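A client for this response format can be sketched in a few lines of Python. This is only an illustration: `build_url`, `parse_response` and `LivestatusSlaveError` are names I made up, and the sketch assumes that live.php accepts the URL-encoded query in the q parameter as in the example above.

```python
import json
from urllib.parse import quote


class LivestatusSlaveError(Exception):
    """Raised when the response header carries a non-zero response code."""


def build_url(base_url, query):
    # URL-encode the query so spaces and other special characters survive
    return base_url + "?q=" + quote(query)


def parse_response(text):
    """Split a LivestatusSlave response into header and body."""
    data = json.loads(text)
    code, message = data[0]
    if code != 0:
        raise LivestatusSlaveError("%s: %s" % (code, message))
    # The body may be missing on errors, so fall back to an empty list
    return data[1] if len(data) > 1 else []
```

Feeding the sample response from above into `parse_response` gives back just the body rows, while any non-zero response code is turned into an exception that carries the response message.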
Feb/101
Minify event_broker_options for Nagios Business Process AddOns
The Nagios Business Process AddOns use the NDOUtils and the NDO database as datasource. So you need to add the NDOUtils NEB module to your Nagios core.
Leaving the NDO configuration in nagios.cfg and ndomod.cfg at the default values may lead to unnecessary performance problems.
You can considerably reduce the amount of information which is forwarded from Nagios to the NDO database using event_broker_options in your nagios.cfg and data_processing_options in ndomod.cfg.
For more details take a look at the description of the event_broker_options parameter.
nagios.cfg: event_broker_options
To minify the data when using the Nagios Business Process AddOns, set the event_broker_options value to 36865. This value is only valid for the Nagios side of the NEB API (that is, the option event_broker_options in nagios.cfg). The value is calculated as follows:
      1 (BROKER_PROGRAM_STATE)
+  4096 (BROKER_STATUS_DATA)
+ 32768 (BROKER_RETENTION_DATA)
-------
= 36865
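The same calculation can be written as a bitmask. The following Python sketch only contains the three flags named above; the class name `Broker` is my own choice and not something Nagios provides.

```python
from enum import IntFlag


class Broker(IntFlag):
    # Only the three flags used in this post, not the full list
    PROGRAM_STATE = 1
    STATUS_DATA = 4096
    RETENTION_DATA = 32768


# Each flag is a distinct bit, so OR-ing them equals adding them up
options = Broker.PROGRAM_STATE | Broker.STATUS_DATA | Broker.RETENTION_DATA
value = int(options)

# Membership test: is a given event type enabled in the combined value?
status_enabled = Broker.STATUS_DATA in options
```

Because every flag occupies its own bit, the sum in the calculation above and the bitwise OR give the same number.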
Be aware: Changing that value will affect ALL your connected NEB modules. This may result in problems when using more than one NEB module.
ndomod.cfg: data_processing_options
The Nagios Business Process AddOns need some more information, like the object configuration (hosts, services, groups, contacts, …). This cannot be controlled in nagios.cfg but only in ndomod.cfg.
The correct value for the option data_processing_options in ndomod.cfg is calculated as follows:
        1 (NDOMOD_PROCESS_PROCESS_DATA)
+    4096 (NDOMOD_PROCESS_HOST_STATUS_DATA)
+    8192 (NDOMOD_PROCESS_SERVICE_STATUS_DATA)
+  262144 (NDOMOD_PROCESS_OBJECT_CONFIG_DATA)
+ 2097152 (NDOMOD_PROCESS_RETENTION_DATA)
---------
= 2371585
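To double check the sum and to produce the line for ndomod.cfg, a small Python sketch may help. The helper names are hypothetical; the flag values are the ones from the calculation above.

```python
# Flag values as listed in the calculation above
NDOMOD_FLAGS = {
    "NDOMOD_PROCESS_PROCESS_DATA": 1,
    "NDOMOD_PROCESS_HOST_STATUS_DATA": 4096,
    "NDOMOD_PROCESS_SERVICE_STATUS_DATA": 8192,
    "NDOMOD_PROCESS_OBJECT_CONFIG_DATA": 262144,
    "NDOMOD_PROCESS_RETENTION_DATA": 2097152,
}


def option_value(flags):
    """Combine all flag values into one option value."""
    value = 0
    for bit in flags.values():
        value |= bit
    return value


def config_line(name, value):
    """Render a 'name=value' line as used in ndomod.cfg and nagios.cfg."""
    return "%s=%d" % (name, value)


line = config_line("data_processing_options", option_value(NDOMOD_FLAGS))
```

Here `line` holds the complete option line that could be pasted into ndomod.cfg.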
Feb/101
Nagios Event Broker: event_broker_options
The Nagios Event Broker (NEB) API is very powerful. A large number of different events in the Nagios core can be hooked by the NEB modules which are connected to the API. With that many events and types of events there may be a big overhead, because a particular module usually needs only a small subset of those events.
There is a default way to control which type of events are forwarded to the NEB modules. The option is called event_broker_options and located in the Nagios configuration file (nagios.cfg).
Feb/100
MKLivestatus 1.1.2 released
MKLivestatus has been released in version 1.1.2 with the newest release of the check_mk package.
The biggest change of the new version is that MKLivestatus offers access to the Nagios logs via the MKLivestatus unix socket. With this improvement it is now possible to access all current status information from the Nagios core and all historical Nagios data from the Nagios log.
In addition to that major change there were several smaller changes:
- Added some new columns about Nagios status data to table ‘status’
- Added new table “comments”
- Added logic for count of pending service and hosts
- Added several new columns in table ‘status’
- Added new columns flap_detection and obsess_over_services in table services
- Fixed bug for double columns: filter truncated double to int
- Added new column status:program_version, showing the Nagios version
- Added new column num_services_pending in table hosts
- Fixed several compile problems on AIX
- Fixed bug: queries could be garbled after interrupted connection
- Fixed segfault on downtimes:contacts
- New feature: sum, min, max, avg and std of columns in new syntax of Stats:
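To give an idea of the new Stats: syntax, here is a small Python sketch which builds such a query. The helper `stats_query` is hypothetical, and the column `execution_time` in the services table is only an example; check the Livestatus documentation for the columns your version offers.

```python
# The aggregation functions named in the changelog above
ALLOWED_FUNCTIONS = ("sum", "min", "max", "avg", "std")


def stats_query(table, column, functions=ALLOWED_FUNCTIONS, filters=()):
    """Build a Livestatus query with one 'Stats:' header per function."""
    lines = ["GET " + table]
    for condition in filters:
        lines.append("Filter: " + condition)
    for function in functions:
        if function not in ALLOWED_FUNCTIONS:
            raise ValueError("unsupported stats function: " + function)
        lines.append("Stats: %s %s" % (function, column))
    # An empty line terminates the query
    return "\n".join(lines) + "\n\n"
```

For example `stats_query("services", "execution_time", ("avg", "max"))` asks for the average and the maximum check execution time over all services.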
With this release MKLivestatus has the potential to become the new standard for Nagios addons for fetching status information from the Nagios core. Fetching the status information using MKLivestatus is dead simple and performs very well. As I’ve written before, NagVis has supported MKLivestatus as a backend since NagVis 1.4.5. I recommend throwing away the NDO and migrating to MKLivestatus if you don’t need the NDO for anything else.
There is also a new Nagios web interface in development which is called Thruk. It keeps the design of the classic Nagios web interface but is Perl based and uses MKLivestatus as its data source.
The current version of MKLivestatus can be downloaded at the website of MKLivestatus.
Jan/100
Nagios Plugin: check_fsrm_quota.pl
On this page you can find the current version of the Nagios check script for checking FSRM (File Server Resource Manager) quotas on Microsoft Windows servers, along with some descriptions and sample configurations.
Idea
On fileservers running Windows 2003 R2 or above it is possible to configure filesystem quotas on a per-directory basis. This is done using the File Server Resource Manager (FSRM). For me it was important not only to have a limit and/or a notification mail from the FSRM but also centralized graphing for trend analysis and centralized visualisation in NagVis.
Until today I have found no other way of gathering this information than checking a Windows server using NRPE. Since I didn’t want to modify anything on the Windows fileservers, I set up a dedicated Windows host to work as a monitoring proxy between Nagios and the FSRM in the Windows world. So my check chain for this task looks as follows:
Nagios
  |--- Active Check ---> NSClient++ (NRPE enabled)
                           |--- System call ---> check_fsrm_quota.pl
                                                   |--- FSRM ---> Fileserver
Sample output
Prerequisites
The check_fsrm_quota.pl is located on the “proxying” Windows host. The script is written in Perl, so ActivePerl needs to be installed and usable on that host. The script only uses the Getopt::Long module, so there are no special Perl module requirements.
The script needs to connect to the FSRM service on the Windows fileservers, so dirquota.exe and srm.dll are needed on the “proxying” Windows host.
check_fsrm_quota.pl
Copy the file to the Windows host to a directory of your choice. I copied it to C:\scripts.
# ##############################################################################
# check_fsrm_quota.pl - Nagios / NRPE check plugin (Windows)
#
# 2009-02-09 Lars Michelsen <lars@vertical-visions.de>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307,
#
# GNU General Public License: http://www.gnu.org/licenses/gpl-2.0.txt
#
# ##############################################################################
# SCRIPT:       check_fsrm_quota.pl
# VERSION:      1.1
# AUTHOR:       Lars Michelsen <lm@larsmichelsen.com>
# DESCRIPTION:  Checks the FSRM service from local or remote for the quota of
#               a defined directory. Thresholds for WARNINGs and CRITICALs
#               can be configured in this script.
# HOMEPAGE:     <http://nagios.larsmichelsen.com/check_fsrm_quota/>
# BUGS:         <http://www.nagios-portal.org/>
# CHANGES:
# 2009-02-09 v1.0 - Initial code
# 2009-03-15 v1.0 - Added perfdata output
#                 - Some code cleanups
# ##############################################################################

use warnings;
use strict;
use Getopt::Long;

my $sRemoteHost;
my $sQuotaPath;
my $sWarn;
my $sCrit;

# Path to binaries
my $sDirquotaExe = 'dirquota.exe';

my $sOutput = '';
my $iState = 0;
my $sState = '';
my $sPerfdata = '';
my $iSummaryState = 0;
my $sSummaryState = '';
my @aStates = ('OK', 'WARNING', 'CRITICAL', 'UNKNOWN');
# Nagios plugin return codes (UNKNOWN is 3 by convention)
my %hStates = ( OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3 );

###############################################################################

my ($oHelp,$oHostname,$oPath,$oWarn,$oCrit);

Getopt::Long::Configure('bundling');
GetOptions(
    'h|help'   => \$oHelp,
    'H|host=s' => \$oHostname,
    'P|path=s' => \$oPath,
    'w|warn=s' => \$oWarn,
    'c|crit=s' => \$oCrit);

if($oHelp) {
    print <<EOU;
Usage:
  $0 -H <FQDN/IP: string> -P <Path: string> -w <warning level: integer> -c <critical level: integer>
  $0 -h

Options:
  -H --host STRING
      FQDN or IP-Address of the Windows Fileserver (W2k3 R2 or above)
  -P --path STRING
      Full system path the Quota has been configured for
  -w --warn INTEGER
      Warning threshold. Give the minimum memory in bytes, MB or GB.
      Examples:
        - Give 10GB to get a WARNING state when the free space is 10GB
          below the hard quota limit.
        - Give 1048576 to get a WARNING state when the free space is 1MB
          below the hard quota limit.
  -c --crit INTEGER
      Critical threshold. Give the minimum memory in bytes, MB or GB.
      Example:
        - Give 5GB to get a CRITICAL state when the free space is 5GB
          below the hard quota limit.
  -h --help
      Print this help text
EOU
    exit 0;
}

# Missing parameters are reported as UNKNOWN so Nagios does not treat them as OK
if(!$oHostname || $oHostname eq '') {
    print('ERROR: No hostname given');
    exit($hStates{'UNKNOWN'});
}

if(!$oPath || $oPath eq '') {
    print('ERROR: No path given');
    exit($hStates{'UNKNOWN'});
}

if(!$oWarn || $oWarn eq '') {
    print('ERROR: No warning threshold given');
    exit($hStates{'UNKNOWN'});
}

if(!$oCrit || $oCrit eq '') {
    print('ERROR: No critical threshold given');
    exit($hStates{'UNKNOWN'});
}

$sRemoteHost = $oHostname;
$sQuotaPath = $oPath;
$sWarn = $oWarn;
$sCrit = $oCrit;

###############################################################################

# Register dll
$sOutput = `regsvr32 /s c:\\windows\\system32\\srm.dll`;

# 1. Query quota list
$sOutput = `$sDirquotaExe Quota List /remote:$sRemoteHost /Path:$sQuotaPath`;

# Split on each line break
my @aOutput = split(/\n/, $sOutput);

# Loop each quota entry and fill hash
my @quotas = ();
my $i = -1;
foreach my $line (@aOutput) {
    if($line =~ /^([A-Za-z\s]+):\s+(.+)$/) {
        my $label = $1;
        my $value = $2;

        # Remove signs in brackets
        $value =~ s/\s+\(.+\)//g;

        if($label eq 'Quota Path') {
            push @quotas, {path => $value};
            $i++;
        } elsif($label eq 'Source Template') {
            $quotas[$i]->{'template'} = $value;
        } elsif($label eq 'Label') {
            $quotas[$i]->{'label'} = $value;
        } elsif($label eq 'Quota Status') {
            $quotas[$i]->{'status'} = $value;
        } elsif($label eq 'Limit') {
            $quotas[$i]->{'limit'} = $value;
        } elsif($label eq 'Used') {
            $quotas[$i]->{'used'} = $value;
        } elsif($label eq 'Available') {
            $quotas[$i]->{'available'} = $value;
        }
    }
}

# Now loop the quotas
foreach my $quota (@quotas) {
    my %quota = %{$quota};

    # Set initial state
    $iState = $hStates{'OK'};
    $sState = 'OK: Available space: '.$quota{'available'};

    my $val = str2bytes($quota{'available'});
    $sPerfdata = 'available='.$val.'b;'.str2bytes($sWarn).';'.str2bytes($sCrit);

    if($val < str2bytes($sWarn)) {
        $iState = $hStates{'WARNING'};
        $sState = 'WARNING: Free space is lower than '.$sWarn.' (Available: '.$quota{'available'}.')';
    }

    if($val < str2bytes($sCrit)) {
        $iState = $hStates{'CRITICAL'};
        $sState = 'CRITICAL: Free space is lower than '.$sCrit.' (Available: '.$quota{'available'}.')';
    }
}

###############################################################################

# Build summary output
print $sState.' | '.$sPerfdata;
exit($iState);

###############################################################################

# Convert strings like "5GB" or "1,5 MB" to bytes; plain numbers are returned as is
sub str2bytes {
    my ($str) = @_;

    if($str =~ m/^([0-9\.\,]+)\s*([A-Z]+)/) {
        my $val = $1;
        my $uom = $2;

        # Change , to .
        $val =~ s/,/./g;

        if($uom eq 'GB') {
            $str = $val * 1024 * 1024 * 1024;
        } elsif($uom eq 'MB') {
            $str = $val * 1024 * 1024;
        }
    }

    return $str;
}
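For readers who do not speak Perl, the unit conversion and threshold logic of the script can be sketched in Python. This is not a port of the whole plugin, just an illustration; unlike the Perl version it rejects unknown units instead of passing the string through.

```python
import re

# Binary multipliers, matching the factors used in the Perl script
UNITS = {"MB": 1024 ** 2, "GB": 1024 ** 3}


def str2bytes(text):
    """Convert '5GB', '1,5 GB' or a plain byte count into bytes."""
    match = re.match(r"^([0-9.,]+)\s*([A-Z]+)", text)
    if match is None:
        return float(text)  # plain number, already in bytes
    unit = match.group(2)
    if unit not in UNITS:
        raise ValueError("unsupported unit: " + unit)
    return float(match.group(1).replace(",", ".")) * UNITS[unit]


def evaluate(available, warn, crit):
    """Return (return code, state name) for the available space."""
    free = str2bytes(available)
    state = (0, "OK")
    if free < str2bytes(warn):
        state = (1, "WARNING")
    if free < str2bytes(crit):
        state = (2, "CRITICAL")
    return state
```

With thresholds of 5GB and 2GB, `evaluate("3.00 GB", "5GB", "2GB")` yields a WARNING, because 3 GB of free space is below the warning threshold but still above the critical one.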
PNP-Template
I have not created a custom template yet, so I use the default PNP template for this script. If you have created one, please let me know.
Parameters
Usage:
  C:\scripts\check_fsrm_quota.pl -H <FQDN/IP: string> -P <Path: string> -w <warning level: integer> -c <critical level: integer>
  C:\scripts\check_fsrm_quota.pl -h

Options:
  -H --host STRING
      FQDN or IP-Address of the Windows Fileserver (W2k3 R2 or above)
  -P --path STRING
      Full system path the Quota has been configured for
  -w --warn INTEGER
      Warning threshold. Give the minimum memory in bytes, MB or GB.
      Examples:
        - Give 10GB to get a WARNING state when the free space is 10GB
          below the hard quota limit.
        - Give 1048576 to get a WARNING state when the free space is 1MB
          below the hard quota limit.
  -c --crit INTEGER
      Critical threshold. Give the minimum memory in bytes, MB or GB.
      Example:
        - Give 5GB to get a CRITICAL state when the free space is 5GB
          below the hard quota limit.
  -h --help
      Print this help text
Sample configuration
Here are a sample command and a sample service definition for the Nagios configuration.
define command {
    command_name    check_proxy_fsrm_quota
    command_line    $USER1$/check_nrpe -H <windows-gateway-host> -u -t 10 -c "check_fsrm_quota" -a "$HOSTADDRESS$" "$ARG1$" "$ARG2$" "$ARG3$"
}

define service {
    host_name               <hostname>
    service_description     fsrm-quota-test
    check_command           check_proxy_fsrm_quota!D:/path/to/directory!5GB!2GB
    use                     template-check-10-min
}
And here the command I used in the configuration file of NSClient++, nsc.ini:
[NRPE Handlers]
check_fsrm_quota=perl c:\scripts\check_fsrm_quota.pl -H $ARG1$ -P $ARG2$ -w $ARG3$ -c $ARG4$
Note: Please note that passing additional command arguments via NRPE, as the example above does, may result in a security problem. Make sure to secure your installation to prevent code injection attacks.
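One possible mitigation is to validate every argument against a strict whitelist before it reaches a shell. The following Python sketch shows the idea for a hypothetical wrapper placed in front of the check script. The patterns are my own assumptions about what valid hostnames, Windows paths and thresholds look like and are not part of NSClient++ or the plugin; adjust them to your environment.

```python
import re

# Hypothetical whitelist patterns; tighten or loosen them as needed
HOST_RE = re.compile(r"[A-Za-z0-9.-]+")
PATH_RE = re.compile(r"[A-Za-z]:[\\/][A-Za-z0-9 _.\\/-]*")
SIZE_RE = re.compile(r"[0-9]+([.,][0-9]+)?\s?(MB|GB)?")


def validate_args(host, path, warn, crit):
    """Raise ValueError if any argument contains unexpected characters."""
    checks = (
        ("host", host, HOST_RE),
        ("path", path, PATH_RE),
        ("warn", warn, SIZE_RE),
        ("crit", crit, SIZE_RE),
    )
    for name, value, pattern in checks:
        # fullmatch makes sure the whole argument fits the pattern
        if pattern.fullmatch(value) is None:
            raise ValueError("invalid %s argument: %r" % (name, value))
    return True
```

Arguments containing shell metacharacters such as `&`, `|` or `;` do not match any of the patterns and are rejected before the plugin is called.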
To gather the available quotas on a remote system you can manually execute the following command on your Windows proxy host:
dirquota.exe Quota List /remote:<windows-fileserver>






