1
Feb/10
0

MKLivestatus 1.1.2 released

MKLivestatus 1.1.2 has been released together with the newest release of the check_mk package.

The biggest change in the new version is that MKLivestatus now offers access to the Nagios logs via the MKLivestatus Unix socket. With this improvement it is possible to access both the current status information from the Nagios core and the historical data from the Nagios log.

In addition to that major change, there were several miscellaneous changes:

  • Added some new columns about Nagios status data to table ‘status’
  • Added new table “comments”
  • Added logic for count of pending service and hosts
  • Added several new columns in table ‘status’
  • Added new columns flap_detection and obsess_over_services in table services
  • Fixed bug for double columns: filter truncated double to int
  • Added new column status:program_version, showing the Nagios version
  • Added new column num_services_pending in table hosts
  • Fixed several compile problems on AIX
  • Fixed bug: queries could be garbled after interrupted connection
  • Fixed segfault on downtimes:contacts
  • New feature: sum, min, max, avg and std of columns in new syntax of Stats:

With this release MKLivestatus has the potential to become the new standard for Nagios addons that fetch status information from the Nagios core. Fetching status information using MKLivestatus is dead simple and performs very well. As I’ve written before, NagVis has supported MKLivestatus as a backend since NagVis 1.4.5. I recommend throwing away the NDO and migrating to MKLivestatus if you don’t need the NDO for anything else.
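To give an idea of how simple such a query is, here is a minimal Python sketch that sends a query to the MKLivestatus Unix socket and reads the answer. The socket path, the function names and the use of `OutputFormat: json` are assumptions for illustration; the path depends on how the broker module was loaded, and JSON output requires a Livestatus version that supports it.

```python
import json
import socket

# Assumed socket path; the real location depends on the broker module configuration.
LIVESTATUS_SOCKET = "/var/lib/nagios/rw/live"


def build_query(table, columns=(), filters=(), stats=()):
    """Build a Livestatus query: one GET line, header lines, then a blank line."""
    lines = ["GET " + table]
    if columns:
        lines.append("Columns: " + " ".join(columns))
    lines.extend("Filter: " + f for f in filters)
    lines.extend("Stats: " + s for s in stats)
    lines.append("OutputFormat: json")
    return "\n".join(lines) + "\n\n"


def run_query(query, path=LIVESTATUS_SOCKET):
    """Send a query to the Livestatus Unix socket and return the decoded rows."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(path)
        sock.sendall(query.encode("utf-8"))
        sock.shutdown(socket.SHUT_WR)  # tell Livestatus that the query is complete
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return json.loads(b"".join(chunks).decode("utf-8"))
```

For example, `run_query(build_query("services", columns=["host_name", "description", "state"], filters=["state = 2"]))` would list all CRITICAL services, and `build_query("log", columns=["time", "message"])` would address the new log table in the same way.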

There is also a new Nagios web interface in development called Thruk. It keeps the classic Nagios web interface design but is Perl based and uses MKLivestatus as its data source.

The current version of MKLivestatus can be downloaded at the website of MKLivestatus.

Filed under: Nagios
25
Jan/10
0

Nagios Plugin: check_fsrm_quota.pl

On this page you can find the current version of the Nagios check script for checking FSRM (File Server Resource Manager) quotas on Microsoft Windows servers, along with some descriptions and sample configurations.

Idea

On fileservers running Windows 2003 R2 or above it is possible to configure filesystem quotas on a per-directory basis. This is done using the File Server Resource Manager (FSRM). For me it was important not only to have a limit and/or a notification mail from the FSRM but also centralized graphing for trend analysis and centralized visualisation in NagVis.

So far I have found no other way of gathering this information than checking a Windows server using NRPE. Since I didn’t want to modify anything on the Windows fileservers, I set up a dedicated Windows host to work as a monitoring proxy between Nagios and the FSRM in the Windows world. So my check chain for this task looks as follows:

Nagios |--- Active Check ---> NSClient++ (NRPE enabled) |--- System call ---> check_fsrm_quota.pl |--- FSRM ---> Fileserver

Sample output

[Figure: check_fsrm_quota sample output]

Prerequisites

The check_fsrm_quota.pl script is located on the “proxying” Windows host. The script is written in Perl, so ActivePerl needs to be installed and usable on that host. The script only uses the Getopt::Long module, so there are no special Perl module requirements.

The script needs to connect to the FSRM service on the Windows fileservers, so dirquota.exe and srm.dll are needed on the “proxying” Windows host.

check_fsrm_quota.pl

Copy the file to the Windows host to a directory of your choice. I copied it to C:\scripts.

# ##############################################################################
# check_fsrm_quota.pl - Nagios / NRPE check plugin (Windows)
#
# 2009-02-09 Lars Michelsen <lars@vertical-visions.de>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
#
# GNU General Public License: http://www.gnu.org/licenses/gpl-2.0.txt
#
# ##############################################################################
# SCRIPT:          check_fsrm_quota.pl
# VERSION:         1.1
# AUTHOR:          Lars Michelsen <lm@larsmichelsen.com>
# DESCRIPTION:     Checks the FSRM service from local or remote for the quota of
#                  a defined directory. Thresholds for WARNINGs and CRITICALs
#                  can be configured in this script.
# HOMEPAGE:        <http://nagios.larsmichelsen.com/check_fsrm_quota/>
# BUGS:            <http://www.nagios-portal.org/>
# CHANGES:
# 2009-02-09 v1.0  - Initial code
# 2009-03-15 v1.0  - Added perfdata output
#                  - Some code cleanups
# ##############################################################################
 
use warnings;
use strict;
use Getopt::Long;
 
my $sRemoteHost;
my $sQuotaPath;
my $sWarn;
my $sCrit;
 
# Path to binaries
my $sDirquotaExe = 'dirquota.exe';
 
my $sOutput = '';
my $iState = 0;
my $sState = '';
my $sPerfdata = '';
my $iSummaryState = 0;
my $sSummaryState = '';
my @aStates = ('OK', 'WARNING', 'CRITICAL', 'UNKNOWN');
my %hStates = ( OK => 0, WARNING => 1, CRITICAL => 2, UNKNOWN => 3 );
 
###############################################################################
 
my ($oHelp,$oHostname,$oPath,$oWarn,$oCrit);
Getopt::Long::Configure('bundling');
GetOptions(
    'h|help' => \$oHelp,
    'H|host=s' => \$oHostname,
    'P|path=s' => \$oPath,
    'w|warn=s' => \$oWarn,
    'c|crit=s' => \$oCrit);
 
if($oHelp) {
print <<EOU;
  Usage: $0 -H <FQDN/IP: string> -P <Path: string> 
            -w <warning level: integer> -c <critical level: integer>
         $0 -h
 
 
    Options:
 
    -H --host STRING
        FQDN or IP-Address of the Windows Fileserver (W2k3 R2 or above)
    -P --path STRING
        Full system path the Quota has been configured for
    -w --warn INTEGER
        Warning threshold. Give the minimum free space in bytes, MB or GB.
 
        Examples:
        - Give 10GB to get a WARNING state when less than 10GB is left
        before the hard quota limit is reached.
        - Give 1073741824 to get a WARNING state when less than 1GB is left
        before the hard quota limit is reached.
    -c --crit INTEGER
        Critical threshold. Give the minimum free space in bytes, MB or GB.
 
        Example:
        - Give 5GB to get a CRITICAL state when less than 5GB is left
        before the hard quota limit is reached.
    -h --help
        Print this help text
EOU
	exit 0;
}
 
# Missing arguments are reported as UNKNOWN (exit code 3) so that Nagios does
# not treat a misconfigured check as OK
if(!$oHostname || $oHostname eq '') {
	print("ERROR: No hostname given\n");
	exit 3;
}
 
if(!$oPath || $oPath eq '') {
	print("ERROR: No path given\n");
	exit 3;
}
 
if(!$oWarn || $oWarn eq '') {
	print("ERROR: No warning threshold given\n");
	exit 3;
}
 
if(!$oCrit || $oCrit eq '') {
	print("ERROR: No critical threshold given\n");
	exit 3;
}
 
$sRemoteHost = $oHostname;
$sQuotaPath = $oPath;
$sWarn = $oWarn;
$sCrit = $oCrit;
 
###############################################################################
 
# Register dll
$sOutput = `regsvr32 /s c:\\windows\\system32\\srm.dll`;
 
# 1. Query quota list
$sOutput = `$sDirquotaExe Quota List /remote:$sRemoteHost /Path:$sQuotaPath`;
 
# Split on each line break
my @aOutput = split(/\n/, $sOutput);
 
# Loop each quota entry and fill hash
my @quotas = ();
my $i = -1;
foreach my $line (@aOutput) {
	if($line =~ /^([A-Za-z\s]+):\s+(.+)$/) {
		my $label = $1;
		my $value = $2;
 
		# Remove signs in brackets
		$value =~ s/\s+\(.+\)//g;
 
		if($label eq 'Quota Path') {
			push @quotas, {path => $value};
			$i++;
		} elsif($label eq 'Source Template') {
			$quotas[$i]->{'template'} = $value;
		} elsif($label eq 'Label') {
			$quotas[$i]->{'label'} = $value;
		} elsif($label eq 'Quota Status') {
			$quotas[$i]->{'status'} = $value;
		} elsif($label eq 'Limit') {
			$quotas[$i]->{'limit'} = $value;
		} elsif($label eq 'Used') {
			$quotas[$i]->{'used'} = $value;
		} elsif($label eq 'Available') {
			$quotas[$i]->{'available'} = $value;
		}
	}
}
 
# Without any parsed quota entry there is nothing to evaluate: report UNKNOWN
if(!@quotas) {
	print("UNKNOWN: No quota found for ".$sQuotaPath." on ".$sRemoteHost."\n");
	exit 3;
}
 
# Now loop the quotas (if several entries match, the last one is reported)
foreach my $quota (@quotas) {
	my %quota = %{$quota};
 
	# Set initial state
	$iState = $hStates{'OK'};
	$sState = 'OK: Available space: '.$quota{'available'};
 
	my $val = str2bytes($quota{'available'});
 
	# Perfdata: value in bytes followed by the warning and critical thresholds
	$sPerfdata = 'available='.$val.'B;'.str2bytes($sWarn).';'.str2bytes($sCrit);
 
	if($val < str2bytes($sWarn)) {
		$iState = $hStates{'WARNING'};
		$sState = 'WARNING: Free space is lower than '.$sWarn.' (Available: '.$quota{'available'}.')';
	}
 
	if($val < str2bytes($sCrit)) {
		$iState = $hStates{'CRITICAL'};
		$sState = 'CRITICAL: Free space is lower than '.$sCrit.' (Available: '.$quota{'available'}.')';
	}
}
 
###############################################################################
 
# Build summary output
print $sState.' | '.$sPerfdata."\n";
exit($iState);
 
###############################################################################
 
sub str2bytes {
	my ($str) = @_;
 
	if($str =~ m/^([0-9\.\,]+)\s*([A-Z]+)/) {
		my $val = $1;
		my $uom = $2;
 
		# Change , to .
		$val =~ s/,/./g;
 
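		if($uom eq 'TB') {
			$str = $val * 1024 * 1024 * 1024 * 1024;
		} elsif($uom eq 'GB') {
			$str = $val * 1024 * 1024 * 1024;
		} elsif($uom eq 'MB') {
			$str = $val * 1024 * 1024;
		} elsif($uom eq 'KB') {
			$str = $val * 1024;
		}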
	}
 
	return $str;
}

PNP-Template

I have not created a custom template yet, so I use the default PNP template for this script. If you have created one, please let me know.

Parameters

  Usage: C:\scripts\check_fsrm_quota.pl -H <FQDN/IP: string> -P <Path: string>
                      -w <warning level: integer> -c <critical level: integer>
         C:\scripts\check_fsrm_quota.pl -h
 
 
    Options:
 
    -H --host STRING
        FQDN or IP-Address of the Windows Fileserver (W2k3 R2 or above)
    -P --path STRING
        Full system path the Quota has been configured for
    -w --warn INTEGER
        Warning threshold. Give the minimum free space in bytes, MB or GB.
 
        Examples:
        - Give 10GB to get a WARNING state when less than 10GB is left
        before the hard quota limit is reached.
        - Give 1073741824 to get a WARNING state when less than 1GB is left
        before the hard quota limit is reached.
    -c --crit INTEGER
        Critical threshold. Give the minimum free space in bytes, MB or GB.
 
        Example:
        - Give 5GB to get a CRITICAL state when less than 5GB is left
        before the hard quota limit is reached.
    -h --help
        Print this help text
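
How the thresholds are compared against the available space can be sketched in a few lines of Python. This is only an illustration of the comparison logic, not a port of the Perl script; the function names and the unit table are assumptions.

```python
import re

# Binary multipliers; the unit table is an assumption for this sketch.
UNITS = {"B": 1, "KB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3, "TB": 1024 ** 4}


def to_bytes(text):
    """Convert '10GB', '1,5 GB' or a plain byte count into a number of bytes."""
    match = re.fullmatch(r"\s*([0-9]+(?:[.,][0-9]+)?)\s*([A-Za-z]*)\s*", text)
    if not match:
        raise ValueError("cannot parse size: %r" % text)
    number = float(match.group(1).replace(",", "."))
    unit = match.group(2).upper() or "B"
    if unit not in UNITS:
        raise ValueError("unknown unit: %r" % unit)
    return int(number * UNITS[unit])


def evaluate(available, warn, crit):
    """Return the Nagios state name for the given available space."""
    free = to_bytes(available)
    if free < to_bytes(crit):
        return "CRITICAL"
    if free < to_bytes(warn):
        return "WARNING"
    return "OK"
```

With `-w 5GB -c 2GB`, a quota with 3.20 GB available would be reported as WARNING, and one with 1.00 GB available as CRITICAL.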

Sample configuration

Here are sample command and service definitions for the Nagios configuration.

define command {
	command_name  check_proxy_fsrm_quota
	command_line  $USER1$/check_nrpe -H <windows-gateway-host> -u -t 10 -c "check_fsrm_quota" -a "$HOSTADDRESS$" "$ARG1$" "$ARG2$" "$ARG3$"
}
 
define service {
	host_name             <hostname>
	service_description   fsrm-quota-test
	check_command         check_proxy_fsrm_quota!D:/path/to/directory!5GB!2GB
	use                   template-check-10-min
}

And here is the command I used in the configuration file of NSClient++, nsc.ini:

[NRPE Handlers]
check_fsrm_quota=perl c:\scripts\check_fsrm_quota.pl -H $ARG1$ -P $ARG2$ -w $ARG3$ -c $ARG4$

Note: Please note that passing command arguments through NRPE, as the example above does, may create a security problem. Make sure to secure your installation to prevent code injection attacks.
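
One possible way to reduce that risk is to check every argument against a strict whitelist before it reaches a command line. The following Python sketch shows the idea; the patterns are deliberately narrow assumptions (for example, paths must start with a drive letter) and would need to be adapted to the naming rules of your environment.

```python
import re

# Whitelist patterns; these are assumptions for illustration, not a complete policy.
HOST_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.-]{0,252}")
PATH_RE = re.compile(r"[A-Za-z]:[\\/][A-Za-z0-9 _.\\/-]*")
SIZE_RE = re.compile(r"[0-9]+(?:[.,][0-9]+)?(?:KB|MB|GB|TB)?")


def validate_args(host, path, warn, crit):
    """Return the arguments as a list if all pass the whitelist, else raise ValueError."""
    checks = (("host", host, HOST_RE), ("path", path, PATH_RE),
              ("warn", warn, SIZE_RE), ("crit", crit, SIZE_RE))
    for name, value, pattern in checks:
        # fullmatch rejects any value with characters outside the whitelist
        if not pattern.fullmatch(value):
            raise ValueError("rejected %s argument: %r" % (name, value))
    return [host, path, warn, crit]
```

Characters such as `&`, `|`, `;` or quotes never pass these patterns, so they cannot be used to append a second command.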

To list the quotas configured on a remote system, you can manually execute the following command on your Windows proxy host:

dirquota.exe Quota List /remote:<windows-fileserver>

Filed under: Nagios
18
Jan/10
1

Ninja – The alternative Nagios GUI

Ninja is a new web interface for Nagios. It is the first project that set out to create a new Nagios GUI which replaces the existing CGIs completely.

The project has the activity, power and connections to bring that task to success. Maybe someday Ninja will be included in the official Nagios package, but it seems this will still take some time. The current version of Ninja is 0.3.6.

The benefits

[Figure: Tactical Overview of Ninja]

Ninja meets several requirements which could not be met by the CGI based web interface of Nagios. The Ninja GUI is written in PHP, which gives it more flexibility and a bigger community to contribute to the project. Ninja is template based, which makes it easy to change the style of the GUI to fit corporate standards. There is also multilanguage support, which may help users in multilingual environments.

The Ninja frontend aims to be much more scalable so that it works well in very large environments where the CGIs are unusable. The Ninja team also has an eye on easy integration of 3rd party addons for Nagios like pnp4nagios, NagVis and so on.

During the development of Ninja the developers started extending the NagVis project with the geomap, a new Google Maps based way of visualizing objects. The geomap is based on Flash Flex and is part of the official NagVis package.

Connecting Nagios and Ninja

Ninja uses the Merlin MySQL Database as its default information source. The initial goal of Merlin was to make the setup of distributed Nagios installations easier. Seeing that the Merlin Database can also easily be used to connect 3rd party applications to Nagios, even in single-server Nagios installations, the developers decided to use it as the default status information backend for Ninja.

You only need a MySQL database to set up the Merlin MySQL Database for Ninja. The setup is similar to that of the NDO: set up the Merlin MySQL Database, add a Nagios Event Broker module to the Nagios process, start the Merlin daemon (merlind) and enjoy it.

Development

The Open Source project was started and is powered by op5, a Swedish IT company which focuses on Open Source IT monitoring. They release a lot of their programs and solutions as Open Source, which benefits other users. The platform for such releases is op5.org.

You can take a look at the Ninja webinterface on the demo page. You can access it with the user and password: monitor.

Filed under: Nagios
13
Jan/10
1

Nagios Init-Script verifyconfig with verbose output

Have you ever wanted to restart your Nagios process after a change to the Nagios configuration and received the following message?

CONFIG ERROR! Check your Nagios configuration.

If you have changed anything in a Nagios configuration file at least once in your lifetime, I am sure you know that message. Now what to do? If you are experienced with this error message you know the answer: run the Nagios binary with the -v (verifyconfig) parameter. Usually the command looks like this:

:> /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

With the command output you can easily find the source of the problem in your Nagios configuration.
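
The manual step can also be wrapped so that the verbose output only appears when the check fails. The following Python sketch assumes the installation paths from the command above and relies on the Nagios binary returning a non-zero exit code when the configuration is invalid; the function name is made up for illustration.

```python
import subprocess

# Paths taken from the command above; adjust them to your installation.
NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"


def verify_config(command=(NAGIOS_BIN, "-v", NAGIOS_CFG)):
    """Run the verification command and return (ok, combined output)."""
    result = subprocess.run(list(command), stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.returncode == 0, result.stdout
```

A caller would run `ok, output = verify_config()` and print `output` only when `ok` is false, before refusing to restart Nagios.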

But wait! Why is it so complicated?

11
Jan/10
1

Thruk – A MKLivestatus webinterface

Thruk is a web interface for Nagios and the second Nagios addon based on MKLivestatus. You may ask: “Why yet another web interface?” What sets Thruk apart from all the other new Nagios web interfaces is its focus and its goals.

6
Jan/10
1

ICINGA

Just one year ago this topic would have made no sense at all. But so many things have happened in the Nagios project that it was just a question of time until a fork of Nagios became reality.

I wrote some words about ICINGA, the Nagios fork, some time ago, so I don’t want to repeat them. But I want to add some background information and thoughts about what is currently happening around it.

ICINGA – The benefits

Today it seems to me that the biggest innovation in the ICINGA project is the new web interface. It makes use of different PHP and JavaScript frameworks which make the ICINGA GUI look very much like a marriage of the Web 2.0 and the Nagios world. That may be a real benefit for users who are fed up with the old Nagios frontend.
I also see a benefit in the IDO, which is a fork of the NDO. The DB team of the ICINGA project works on fixing the biggest problems in that database output layer. They also work hard to make the usage of other databases like Oracle for the IDO possible.

A clear statement right from the beginning of ICINGA was not to break with Nagios and to keep the core compatible with the Nagios core. That is a benefit not just for users who want to switch from Nagios to ICINGA; it is also a benefit for users who would like to switch back. Switching back from ICINGA may sound strange from the current point of view, but I think it may be a valid step in the future.

Should ICINGA really be independent?

Assuming the Nagios and ICINGA cores don’t diverge and stay nearly the same as they are today, a re-merge could be an option. In this case I would take the step and reduce the ICINGA project to a Nagios web interface with IDO as database backend. Reducing the ICINGA project, which is a complete Nagios fork today, to a web interface may sound strange, since “yet another Nagios interface” does not sound as attractive as a monitoring solution of its own. From my point of view ICINGA is exactly that at the moment and nothing different.
I might be wrong because I may not see all details of the current progress of the ICINGA project, but that’s what I see on the surface.

Nagios inactivity – The boost for ICINGA?

Other things may happen. Ethan Galstad is working on Nagios XI, a package which bundles Nagios with several addons. That solution is an enterprise product, so it will have to be paid for. It may be that the Nagios core will benefit from this development, but in the first instance it leaves less time to push the development of Nagios. So this may again result in inactivity of the Nagios developer team.
With this inactivity Nagios will fall behind, and there will be hope for more activity in the ICINGA core project. But since the ICINGA project doesn’t want to break with Nagios, it may shy away from making more complex changes to the ICINGA core which could result in incompatibilities. This may result in total inactivity in the Nagios/ICINGA core, but things may also turn out completely differently.

Shinken may shine brighter

Another approach came up several days ago which totally breaks with Nagios: Shinken. The name Shinken sounds funny in German because it sounds like the German word for ham: Schinken. Shinken is not another fork of Nagios. It is a totally reworked Nagios which is written in Python. That does not seem to cause a performance issue. Apart from that, the project has some very interesting ideas which should be a big benefit for many Nagios installations. It seems that this could breathe some fresh air into the world of Nagios.

Filed under: Nagios
4
Jan/10
2

Nagios Configuration

Nagios is configured through plain text files with a special syntax. There are two types of configuration files in Nagios 3.x.

First, there are the main configuration files which control the behavior of the individual components. For example, nagios.cfg controls the Nagios daemon and cgi.cfg is used to configure the CGI based web frontend. Another special file is resource.cfg, which can be used to configure general options which should be available in the object configuration files.

The second type of Nagios configuration files are the object configuration files. These files are used to add objects like hosts, services, hostgroups, servicegroups, commands, timeperiods and so on.

More information about this can be found in the Official Nagios Documentation.

nagios.cfg

The nagios.cfg uses a simple syntax: key=value. Comment lines begin with a #.

The default Nagios configuration file is created when executing make install-config. It contains many comments with useful information about the individual configuration options.

All important information like the place of the configuration files is set in the nagios.cfg file. So this file is the central place to control the configuration of Nagios.

The Nagios Documentation covers this topic very thoroughly.

Nagios Object Configuration Files

The object configuration files are loaded as configured in the nagios.cfg file: use the cfg_file option for single files and the cfg_dir option to add all files in a directory.
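
To see which object configuration files a given nagios.cfg would pull in, the key=value syntax and the two options can be evaluated with a short Python sketch. It assumes that cfg_dir picks up every file ending in .cfg, including those in subdirectories, as described in the Nagios documentation; the function names are made up for illustration.

```python
import os


def parse_main_config(text):
    """Parse key=value lines; keys such as cfg_file may occur several times."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comment lines and anything without a key=value pair
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options.setdefault(key.strip(), []).append(value.strip())
    return options


def object_config_files(options):
    """List the files named by cfg_file plus every *.cfg file below each cfg_dir."""
    files = list(options.get("cfg_file", []))
    for directory in options.get("cfg_dir", []):
        for root, _dirs, names in os.walk(directory):
            files.extend(os.path.join(root, name)
                         for name in sorted(names) if name.endswith(".cfg"))
    return files
```

Calling `object_config_files(parse_main_config(open(path).read()))` with the path of your nagios.cfg gives a quick overview of where objects may be defined.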

The Nagios Documentation covers this very thoroughly, so I won’t write any more about the syntax of these files.

Flexible object configuration concept

As a Nagios installation grows, many Nagios administrators notice that the number of objects grows strongly too. Without a clear and flexible concept for managing the Nagios object configuration, growing Nagios installations will end up in pure chaos.

When starting with Nagios it is very hard to find the best concept for the object configuration of a specific Nagios environment. So if you start your Nagios installation from scratch without any previous knowledge about Nagios, I strongly recommend planning for a redesign of the complete configuration after some time.

Once you have learned your first lessons using Nagios, you will recognize there are several ways to make your Nagios configuration more flexible and more compact so that it can be administered more easily. At this stage most professional Nagios users define some rules for their specific Nagios installations. These rules may begin with object naming concepts and end up with complex object management standards.

Just some basic topics you should think about when optimizing your object configurations:

  • Use templates (Object Inheritance) to assign common options
  • Use templates to assign groups
  • Use hostgroups to assign services to a group of hosts
  • Use default wildcards like the * or ! to match objects when assigning to each other
  • Use real regular expressions to assign objects to each other

The points listed above are just some examples of how Nagios configurations may be tuned. There are a lot of other options.

Nagios Configuration Tools

There are a lot of other approaches to make the configuration of Nagios easier. Most of these solutions are web based configuration tools. All these tools have the disadvantage that they restrict what you can configure and how you can handle the configuration.

Most of the Nagios configuration utilities use their own databases in which the Nagios configuration is modified. When a modification has been finished, the Nagios configuration files are generated from the database. After that the Nagios process is restarted with the new configuration. This adds the disadvantage that manual modifications to those configuration files will be overwritten each time the configuration is generated. So even experienced Nagios administrators need to use the web based configuration utility.

For some Nagios configuration utilities take a look at the Nagios Addons page.

Filed under: Nagios
1
Jan/10
0

Offer: Nagios Plugin Development

Developing a Nagios plugin is not really a hard job. But developing good Nagios plugins which fit special needs and requirements, run fast and gather only the needed information requires more experience and skill.

22
Dec/09
3

NDOUtils: Nagios Data Out

NEB modules can be used in a lot of cases and ways. For example, there is the very popular Nagios Data Out (NDO) database abstraction layer. The NDOutils have been developed and are distributed by the Nagios Core Team.

The NDOutils can be separated into two components. The ndomod is the NEB module which is loaded into the Nagios process. It forwards the information to listening daemons using TCP or Unix sockets. The second component is a listening daemon. The most popular daemon is ndo2db. It retrieves the information from ndomod and pushes it to a relational database.

21
Dec/09
1

Nagios Event Broker

The Nagios Event Broker (NEB) is the way Nagios makes internal information available to external libraries. NEB is based on shared code libraries which are called modules. The NEB modules are hooked into the Nagios core process when Nagios starts. The NEB uses callback routines which have to be provided by the NEB modules. These routines are executed when particular events occur in the Nagios server process.
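
The callback principle can be illustrated with a small Python sketch. Real NEB modules are compiled shared libraries loaded by the Nagios process, so this is purely conceptual: the class, the method names and the event names are invented for illustration and are not the NEB API.

```python
class EventBroker:
    """Conceptual stand-in for the broker: maps event types to callbacks."""

    def __init__(self):
        self._callbacks = {}

    def register(self, event_type, callback):
        """Called by a module at load time to subscribe to one event type."""
        self._callbacks.setdefault(event_type, []).append(callback)

    def emit(self, event_type, data):
        """Called by the core when an event occurs; runs every subscribed callback."""
        for callback in self._callbacks.get(event_type, []):
            callback(data)
```

A module would call `register("service_check", handler)` once, and the core would then invoke `handler` with the event data every time such an event occurs.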