check_esx4_storage
On this page you will find the current version of the ESX4 storage check plugin
Idea
We run VMware ESXi 4 on DL380G5 hardware. ESXi boots from a USB stick, and the internal P400 controller serves the storage for the virtual machines. This setup has worked very well for us since version 3. Shortly after the release of version 3, several Nagios plugins were released for checking the health of the ESX host and the virtual machines.
However, we were not able to fetch the storage information from ESXi 3, so we had to wait for the release of ESXi 4. And in fact, it is now possible to fetch the storage information from ESXi.
Solution
To bring the status information into Nagios, I wrote a small Nagios plugin which uses the VMware Infrastructure Perl Toolkit to gather this information from the ESXi servers.
Prerequisites
You need the VMware Infrastructure Perl Toolkit installed on your Nagios server for the plugin to work. I installed VIPerl following the howto included in check_esx3 from op5:
Download the latest version of the Perl Toolkit from the VMware support page. In this example we use VMware-VIPerl-1.6.0-104313.i386.tar.gz, but the instructions should apply to newer versions as well. Upload the file to your Nagios server’s /root dir and execute:

cd /root
tar xvzf VMware-VIPerl-1.6.0-104313.i386.tar.gz
cd vmware-viperl-distrib/
./vmware-install.pl

Follow the on-screen instructions, described below:

“Creating a new VMware VIPerl Toolkit installer database using the tar4 format. Installing VMware VIPerl Toolkit. You must read and accept the VMware VIPerl Toolkit End User License Agreement to continue. Press enter to display it.”
“Read through the License Agreement”
“Do you accept? (yes/no) yes”
“In which directory do you want to install the executable files? [/usr/bin]”

The following Perl modules were found on the system but may be too old to work with VIPerl:
Crypt::SSLeay
Compress::Zlib

The installation of VMware VIPerl Toolkit 1.6.0 build-104313 for Linux completed successfully. You can decide to remove this software from your system at any time by invoking the following command: /usr/bin/vmware-uninstall-viperl.pl.

Enjoy,
--the VMware team

Note: “Crypt::SSLeay” and “Compress::Zlib” are not needed for check_esx3 to work.
check_esx4_storage
Copy the file to your nagios/libexec directory, fix the owner, and make it executable.
#!/usr/bin/perl
# ##############################################################################
# 2009-10-20 Lars Michelsen <lars@vertical-visions.de>
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of the GNU General Public License
# as published by the Free Software Foundation; either version 2
# of the License, or (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
#
# GNU General Public License: http://www.gnu.org/licenses/gpl-2.0.txt
#
# ##############################################################################
# SCRIPT:      check_esx4_storage.pl
# VERSION:     1.0
# AUTHOR:      Lars Michelsen
# DESCRIPTION: Checks the storage health status in VMWare ESX4 servers using
#              the VMware VIPerl toolkit. The script has been written for
#              checking HP DL380g5 server with built-in P400 controller
#              Inspired by the Hardware.pl found on
#              <http://communities.vmware.com/docs/DOC-10665>
# BUGS:        Please report bugs on <http://www.nagios-portal.org>
# CHANGES:
# 2009-10-20 v1.0 Initial code
# ##############################################################################

use strict;
use warnings;
use VMware::VILib;
use WSMan::StubOps;

$Util::script_version = "1.0";

#
# Nagios specific definitions
#

my %ERRORS = ('OK' => 0, 'WARNING' => 1, 'CRITICAL' => 2, 'UNKNOWN' => 3);
my %ERRORCODES = (0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL', 3 => 'UNKNOWN');
my %HEALTHSTATUS2NAGIOSCODE = ('Unknown'               => 3,
                               'OK'                    => 0,
                               'Degraded/Warning'      => 1,
                               'Minor failure'         => 1,
                               'Major failure'         => 2,
                               'Critical failure'      => 2,
                               'Non-recoverable error' => 2);
my $output = '';
my $perfdata = '';
my $exitCode = 0;

#
# VMWare API definitions
#

my @classes = ("VMware_Controller", "VMware_StorageExtent",
               "VMware_StorageVolume", "VMware_SASSATAPort");
my %healthstatus = (0  => "Unknown",
                    5  => "OK",
                    10 => "Degraded/Warning",
                    15 => "Minor failure",
                    20 => "Major failure",
                    25 => "Critical failure",
                    30 => "Non-recoverable error");
my %hardwaregroup = ("VMware_Controller"    => "Storage",
                     "VMware_StorageExtent" => "",
                     "VMware_StorageVolume" => "",
                     "VMware_SASSATAPort"   => "");
my @operationalstatus = ("Unknown", "Other", "OK", "Degraded", "Stressed",
                         "Predictive Failure", "Error", "Non-Recoverable Error",
                         "Starting", "Stopping", "Stopped", "In Service",
                         "No Contact", "Lost Communication", "Aborted", "Dormant",
                         "Supporting Entity in Error", "Completed", "Power Mode",
                         "DMTF Reserved", "Vendor Reserved");

# General variable Declaration
my $client;
my %opts = (
    namespace => {
        type     => "=s",
        help     => "Namespace for all queries. Default is root/cimv2",
        required => 0,
        default  => "root/cimv2",
    },
    timeout => {
        type     => "=s",
        help     => "Default http timeout for all the queries. Default is 120",
        required => 0,
        default  => "120"
    }
);

Opts::set_option('protocol', 'http');
Opts::set_option('servicepath', '/wsman');
Opts::set_option('portnumber', '80');
Opts::add_options(%opts);
Opts::parse();

# validate() would use STDIN for input of username and password
# This should not be done. Instead print the usage and terminate
if(!Opts::get_option('username') || !Opts::get_option('password')) {
    print "ERROR: The options username or password are not set\n";
    Opts::usage();
    exit($ERRORS{UNKNOWN})
}

Opts::validate();

################################################################################
# Main
################################################################################

# Connect to ESX host
createConnection();

# Get hardware information
my @hw = @{getStorageHardware()};

# Catch no hardware information error (the list of devices is empty)
if(scalar(@hw) == 0) {
    $output = 'No storage Hardware information found';
    $exitCode = $ERRORS{UNKNOWN};
}

# Loop all hardware devices and build the output string
my $elemCode = 0;
foreach my $hw (@hw) {
    # DEBUG:
    #print $hw->{instanceName}."\n";
    #print $hw->{elementName}."\n";
    #print $hw->{healthStatus}."\n";
    #print $hw->{operationalStatus}."\n";

    # Translate VMware health status to Nagios status code
    $elemCode = $HEALTHSTATUS2NAGIOSCODE{$hw->{healthStatus}};

    # Build summary output
    $output .= $ERRORCODES{$elemCode} . ': '. $hw->{elementName}."\n";

    # Build summary status
    if($elemCode > $exitCode) {
        $exitCode = $elemCode;
    }
}

# Print the Nagios output
if($perfdata ne '') {
    $output .= ' | '.$perfdata
}

print $ERRORCODES{$exitCode}. ': Summary status is ' . $ERRORCODES{$exitCode} . ". "
    . "For details take a look at the long output.\n"
    . $output . "\n";
exit($exitCode);

################################################################################
# Subs
################################################################################

sub getStorageHardware {
    my @ret = ();
    my $healthStatus = "";
    my $operationalStatus = "";
    my $instanceName = "";
    my $elementName = "";

    # Loop all classes which should be queried
    foreach my $class (@classes) {
        # Read all instances of the class
        my @details = $client->EnumerateInstances(class_name => $class);

        # Loop all elements in the instance
        foreach my $elem (@details) {
            # Don't handle empty elements
            if($elem && $elem ne "") {
                # Instance name is the type of the object
                $instanceName = (keys(%{$elem}))[0];

                # Display Name of the element
                #
                # e.g.
                # HP Smart Array P400 Controller : HPSA1
                # Disk 1 on HPSA1 : Port 1I Box 1 Bay 8 : 136GB : Spare Disk
                $elementName = $elem->{$instanceName}->{ElementName};

                # Shorten the display name for nice output
                #if(length($elementName) > 43) {
                #    $elementName = substr($elementName, 0, 40);
                #    $elementName = $elementName . "...";
                #}

                # Health information available?
                # When it is: Gather the status code and translate to VMware status description
                if($elem->{$instanceName}->{HealthState}
                   && exists $healthstatus{$elem->{$instanceName}->{HealthState}}) {
                    $healthStatus = $healthstatus{$elem->{$instanceName}->{HealthState}};
                } else {
                    $healthStatus = "Unknown";
                }

                # Operational status available?
                # When it is: Gather the status code and translate to VMware status description
                if($elem->{$instanceName}->{OperationalStatus}
                   && $elem->{$instanceName}->{OperationalStatus} <= (scalar(@operationalstatus)-1)) {
                    $operationalStatus = $operationalstatus[$elem->{$instanceName}->{OperationalStatus}];
                } else {
                    $operationalStatus = "Unknown";
                }

                push(@ret, {'instanceName'      => $instanceName,
                            'elementName'       => $elementName,
                            'healthStatus'      => $healthStatus,
                            'operationalStatus' => $operationalStatus});
            }
        }
    }

    return \@ret;
}

sub createConnection {
    # Set the connection parameters from the environment
    my %args = (
        path      => Opts::get_option('servicepath'),
        username  => Opts::get_option('username'),
        password  => Opts::get_option('password'),
        port      => Opts::get_option('portnumber'),
        address   => Opts::get_option('server'),
        namespace => Opts::get_option('namespace'),
        timeout   => Opts::get_option('timeout')
    );

    # Create the connection object in the client.
    $client = WSMan::GenericOps->new(%args);

    # Register extra CIM namespaces that the WS-Management server might require.
    $client->register_class_ns(OMC    => 'http://schema.omc-project.org/wbem/wscim/1/cim-schema/2',
                               VMware => 'http://schemas.vmware.com/wbem/wscim/1/cim-schema/2',
                               ELXHBA => 'http://schemas.emulex.org/wbem/wscim/1/cim-schema/2');
}
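The way the plugin turns the per-device health states into one Nagios result can be tried without an ESXi host. The following is only a minimal sketch: it reuses the two lookup tables from the script, but the device list is made-up sample data and the helper name summarize is hypothetical, not part of the plugin.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Lookup tables copied from check_esx4_storage.pl
my %ERRORCODES = (0 => 'OK', 1 => 'WARNING', 2 => 'CRITICAL', 3 => 'UNKNOWN');
my %HEALTHSTATUS2NAGIOSCODE = ('Unknown' => 3, 'OK' => 0, 'Degraded/Warning' => 1,
                               'Minor failure' => 1, 'Major failure' => 2,
                               'Critical failure' => 2, 'Non-recoverable error' => 2);

# Made-up sample data (assumption): stands in for what getStorageHardware() returns
my @hw = (
    { elementName => 'HP Smart Array P400 Controller : HPSA1', healthStatus => 'OK' },
    { elementName => 'Disk 2 on HPSA1',                        healthStatus => 'Degraded/Warning' },
    { elementName => 'Logical Volume 1 on HPSA1',              healthStatus => 'Major failure' },
);

# Hypothetical helper: returns the highest status code and the long output
sub summarize {
    my @devices = @_;
    my ($exitCode, $output) = (0, '');
    foreach my $dev (@devices) {
        my $code = $HEALTHSTATUS2NAGIOSCODE{$dev->{healthStatus}};
        $code = 3 unless defined $code;          # unmapped state counts as UNKNOWN
        $output .= "$ERRORCODES{$code}: $dev->{elementName}\n";
        $exitCode = $code if $code > $exitCode;  # keep the numerically highest code
    }
    return ($exitCode, $output);
}

my ($exitCode, $output) = summarize(@hw);
print "$ERRORCODES{$exitCode}: Summary status is $ERRORCODES{$exitCode}.\n$output";
```

Because the codes are compared numerically, a device in state Unknown (3) outranks one in a critical state (2) in the summary line.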
Sample output
The simplest way to use the script is to call it like this:
# ./check_esx4_storage.pl --server esx4i-test.mydomain.com --username monitoring --password <PASSWORD>

The output on my test system looks like this:

OK: Summary status is OK. For details take a look at the long output.
OK: HP Smart Array P400 Controller : HPSA1
OK: Disk 1 on HPSA1 : Port 1I Box 1 Bay 8 : 136GB : Spare Disk
OK: Disk 2 on HPSA1 : Port 1I Box 1 Bay 7 : 136GB : Data Disk
OK: Disk 3 on HPSA1 : Port 1I Box 1 Bay 6 : 136GB : Data Disk
OK: Disk 4 on HPSA1 : Port 1I Box 1 Bay 5 : 136GB : Data Disk
OK: Disk 5 on HPSA1 : Port 2I Box 1 Bay 4 : 136GB : Data Disk
OK: Disk 6 on HPSA1 : Port 2I Box 1 Bay 3 : 136GB : Data Disk
OK: Disk 7 on HPSA1 : Port 2I Box 1 Bay 2 : 136GB : Data Disk
OK: Disk 8 on HPSA1 : Port 2I Box 1 Bay 1 : 136GB : Data Disk
OK: Logical Volume 1 on HPSA1 : RAID 5 : 820GB : Disk 2,3,4,5,6,7,8,1
Since this plugin uses multiline output, only the line “OK: Summary status is OK. For details take a look at the long output.” will be shown on the status overview page. The long output, including all the lines, is shown only on the service detail page.
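To illustrate the split between the first line and the long output, here is a simplified sketch. It is not how Nagios itself is implemented; the helper name split_plugin_output is hypothetical, and performance data after a "|" is ignored.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the first line is the summary, everything after it
# is treated as long output. Perfdata handling is left out on purpose.
sub split_plugin_output {
    my ($raw) = @_;
    my ($summary, @long) = split /\n/, $raw;
    $summary = '' unless defined $summary;   # empty plugin output
    return ($summary, join("\n", @long));
}

# Shortened copy of the sample output shown above
my $raw = "OK: Summary status is OK. For details take a look at the long output.\n"
        . "OK: HP Smart Array P400 Controller : HPSA1\n"
        . "OK: Disk 1 on HPSA1 : Port 1I Box 1 Bay 8 : 136GB : Spare Disk\n";

my ($summary, $long) = split_plugin_output($raw);
print "Overview page: $summary\n";
print "Detail page:\n$long\n";
```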








11:51 on October 26th, 2009
Great script! Thank you!
For debian/ubuntu, you need to install libcrypt-ssleay-perl, libsoap-lite-perl, libuuid-perl, libdata-dump-perl
Tested on a ml370G6 + P410i
14:18 on January 17th, 2011
After the vSphere update to 4.1, we get the following error: “401 Unauthorized at /usr/share/perl/5.10/WSMan/WSBasic.pm line 199”
Has anyone tried the plugin after the update to vSphere 4.1?
08:18 on March 3rd, 2011
Is there a way to run as a non-administrative user? I keep getting the error “401 Unauthorized at /usr/share/perl/5.10/WSMan/WSBasic.pm line 199”
10:36 on July 15th, 2011
Hi,
Nice plugin, but sometimes I get the following error:
Additional Info:
(Return code of 104 is out of bounds)
Any idea?
09:45 on August 9th, 2011
Hi, getting the same error with vSphere 4.1 when connecting to a single ESX server: 500 Can’t connect to esx-****:443 (certificate verify failed) at /usr/lib/perl5/5.10.0/WSMan/WSBasic.pm line 199
Is there a way to get around this?
Best Regards, T.
14:34 on August 24th, 2011
set this in your Perl Script
$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
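As a rough sketch of where that line could go: the variable presumably has to be set before the first HTTPS request is made, so the top of the script is a safe place. This assumes the toolkit talks to the host through LWP::UserAgent 6.x, which reads this environment variable; that is an assumption, not something confirmed for every VIPerl version.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: LWP::UserAgent 6.x evaluates this variable when it sets up its
# SSL options. A BEGIN block runs before the rest of the script executes,
# so the value is in place before any connection is opened.
BEGIN {
    $ENV{PERL_LWP_SSL_VERIFY_HOSTNAME} = 0;
}

# ... the rest of check_esx4_storage.pl (use VMware::VILib; and so on)
# would follow here.

print "PERL_LWP_SSL_VERIFY_HOSTNAME=$ENV{PERL_LWP_SSL_VERIFY_HOSTNAME}\n";
```

Keep in mind that this turns off certificate checking for the connection, so it is only reasonable on a network you trust.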
09:12 on November 29th, 2011
Hi!
I’m trying to monitor a DL 380 G4 running ESXi 4.1.0 (348481), but I only get an “UNKNOWN: Summary status is UNKNOWN. No storage Hardware information found” from the plugin.
On another Server (DL380 G7) everything works fine.
Do you have any suggestions? Do I have to configure anything to be able to read the storage status information?
Best Regards, M. van der Kamp
19:14 on December 1st, 2011
Mhm. Have no real idea. Maybe there is no support for the G4 in the VI perl toolkit.
16:07 on December 6th, 2011
Hello lami
I tried to monitor our hp proliant d460 g1 machines with esxi 5.0
./check_esx4_storage.pl --server es.li.gov --username xxx --password xxxx
but I get the following error:
UNKNOWN: Summary status is UNKNOWN. For details take a look at the long output. No storage Hardware information found
is the script still working with esxi 5.0?
best regards
08:33 on December 8th, 2011
Hi, thank you for the quick feedback. I just checked the sensor overview in the hardware tabs inside the vSphere client, and there is no information about the storage. So I guess when I see no information there, then I just can’t read it with the VI Perl toolkit.
Since the server is not on the compatibility list, this is no surprise…
Thank you anyway! Michael
21:35 on December 29th, 2011
Has anyone tried this script with ESXi 5? We’ve migrated a couple boxes here which worked at version 4, but not now. (I know it says “esx4” right in the name.) I’m just curious if anyone’s made it work with 5.