Updated 01 April 2020

How To Create and Implement a Custom Remote nrpe Plugin


Creating Custom Remote nrpe plugin

This procedure describes how to create a remote nrpe plugin for Nagios:

  • Create a nrpe script/command that generates properly formatted output and exit codes.
    • Install the command/script in /usr/lib64/nagios/plugins on the client.
  • On the client server, declare that this command exists (/etc/nagios/nrpe.cfg).
  • On the Nagios server:
    • Add the new command to commands.cfg. For remote execution, the command must be called with check_nrpe.
    • Add a new service to services.cfg.
    • Define a new hostgroup in hostgroups.cfg for the hosts that are to be monitored.

Problem

Monitoring a process on a client server requires the check command to be executed locally on that client. For example, testing whether ldap is running requires the ldapsearch command, which is not installed on the Nagios server. Hence an executable plugin must be created that runs only on the client server.

Creating nrpe Script

A nrpe script or program must generate an exit code (return code) and description. The exit codes are as follows:

nrpe Exit codes (return codes)
 Numerical Value  Service Status  Status Description
 0  OK  The plugin was able to check the service and it appeared to be functioning properly
 1  Warning  The plugin was able to check the service, but it appeared to be above some "warning" threshold or did not appear to be working properly
 2  Critical  The plugin detected that either the service was not running or it was above some "critical" threshold
 3  Unknown  Invalid command line arguments were supplied to the plugin or low-level failures internal to the plugin (such as unable to fork, or open a tcp socket) that prevent it from performing the specified operation. Higher-level errors (such as name resolution errors, socket timeouts, etc) are outside of the control of plugins and should generally NOT be reported as UNKNOWN states.
  • The EXIT CODE is generated by the execution of the script
  • The DESCRIPTION is standard output text

Template

This is a template bash shell script. The script should be created on the client in the /usr/lib64/nagios/plugins directory, typically with no suffix.

#!/bin/bash
#==============================================================================
# Copyright: LogiQwest 2017
# Name: check_template
# OS: Linux
# Location: /usr/lib64/nagios/plugins
# Purpose: Check nrpe template
# License: This template is provided free of charge with no warranty or support.
# Users are free to modify and distribute freely with no restrictions.
#------------------------------------------------------------------------------
# Change history:
# Version 1.00:05 Mar 2017 Created
#==============================================================================
VERSION='1.00'
# Associate ARGUMENT if required
ARG=$1
# <insert shell script commands to verify an operation and create a description>
# <and a result. For example DESCRIPTION="something" and RESULTS="ok|warning|failed|unknown">
case "${RESULTS}" in
   'ok')
         echo "OK- ${DESCRIPTION}"
         exit 0
         ;;
   'warning')
         echo "WARNING- ${DESCRIPTION}"
         exit 1
         ;;
   'failed')
         echo "CRITICAL- ${DESCRIPTION}"
         exit 2
         ;;
   'unknown'|*)
         # Any unset or unexpected RESULTS value is reported as UNKNOWN
         echo "UNKNOWN- ${DESCRIPTION}"
         exit 3
         ;;
esac
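As one possible way to fill in the placeholder section of the template, the sketch below classifies an integer against a warning and a critical threshold and sets RESULTS and DESCRIPTION for the case statement. The function name CLASSIFY_VALUE and its arguments are illustrative assumptions and are not part of the original template.

```shell
#!/bin/bash
# Hypothetical check body for the template: classify an integer against
# warning and critical thresholds. All names here are illustrative.
CLASSIFY_VALUE ()
{
    local value=$1 warn=$2 crit=$3
    # Anything that is not a non-negative integer cannot be evaluated.
    case "${value}" in
        ''|*[!0-9]*)
            RESULTS='unknown'
            DESCRIPTION="value '${value}' is not a number"
            return
            ;;
    esac
    if [ "${value}" -ge "${crit}" ]; then
        RESULTS='failed'      # the template maps 'failed' to CRITICAL (exit 2)
    elif [ "${value}" -ge "${warn}" ]; then
        RESULTS='warning'
    else
        RESULTS='ok'
    fi
    DESCRIPTION="value=${value} (warn=${warn}, crit=${crit})"
}
```

Calling CLASSIFY_VALUE with a measured number and two thresholds just before the case statement would leave RESULTS and DESCRIPTION set to values the template already knows how to report.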

Example Script

#!/bin/bash
#==============================================================================
# Copyright: Logiqwest 2017
# Name: check_ldap_replication
# OS: Linux
# Location: /usr/lib64/nagios/plugins
# Purpose: Check ldap replication against master 
# License: This script is provided free of charge with no warranty or support.
# Users are free to modify and distribute freely with no restrictions.
#------------------------------------------------------------------------------
# Change history:
# Version 1.00:03 Mar 2017 Created by Michael Barto
#==============================================================================
VERSION='1.00'
MASTER_LDAP_SERVER=$1
SLAVE_LDAP_SERVER=$(hostname)

RUN_LDAP_REPLICATION_CHECK ()
{
      # Read the contextCSN of the master; $? is the status of the last grep in the pipeline
      master_contextCSN=$(ldapsearch -x -D "cn=Administrator,dc=freightgate,dc=com" -w '<PASSWORD>' -H ldaps://${MASTER_LDAP_SERVER}:636 -P 3 -s base -b "dc=freightgate,dc=com" contextCSN | grep contextCSN | awk '{print $NF}' | grep -v contextCSN)
      TEST_RUN=$?
      if [ $TEST_RUN -eq 0 ]; then
          # Read the contextCSN of this slave and record the status of that lookup
          slave_contextCSN=$(ldapsearch -x -D "cn=Administrator,dc=freightgate,dc=com" -w '<PASSWORD>' -H ldaps://${SLAVE_LDAP_SERVER}:636 -P 3 -s base -b "dc=freightgate,dc=com" contextCSN | grep contextCSN | awk '{print $NF}' | grep -v contextCSN)
          TEST_RUN=$?
          if [ $TEST_RUN -eq 0 ]; then
              DESCRIPTION="${SLAVE_LDAP_SERVER} ${slave_contextCSN}"
              # Replication is in sync only when both servers report the same contextCSN
              if [[ "${slave_contextCSN}" != "${master_contextCSN}" ]]; then
                  RESULTS='failed'
              else
                  RESULTS='ok'
              fi
          else
              RESULTS='unknown'
              DESCRIPTION="${SLAVE_LDAP_SERVER} unknown"
          fi
      else
          RESULTS='unknown'
          DESCRIPTION="${SLAVE_LDAP_SERVER} unknown"
      fi
}
 
 OUTPUT_RESULTS ()
 {
      case "${RESULTS}" in
         'ok')
               echo "OK- ${DESCRIPTION} 0"
               exit 0
               ;;
         'warning')
               echo "WARNING- ${DESCRIPTION} 1"
               exit 1
               ;;
         'failed')
               echo "CRITICAL- ${DESCRIPTION} 2"
               exit 2
               ;;
         'unknown')
               echo "UNKNOWN- ${DESCRIPTION} 3"
               exit 3
               ;;
       esac
 }
# Main ----------------------------------------------------------
RUN_LDAP_REPLICATION_CHECK
OUTPUT_RESULTS
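
The example script treats any difference between the two contextCSN values as CRITICAL and never uses the WARNING state. As a hedged sketch only, the functions below show how the timestamp at the front of a contextCSN value could be converted into a lag in seconds, which could then be compared against thresholds. The function names are hypothetical, and the sketch assumes GNU date and a contextCSN that begins with a 14-digit UTC timestamp, as in the sample output shown later in this article.

```shell
#!/bin/bash
# Hypothetical helper: convert the leading YYYYmmddHHMMSS timestamp of a
# contextCSN value to epoch seconds (assumes GNU date and UTC timestamps).
CSN_TO_EPOCH ()
{
    local stamp=${1:0:14}
    date -u -d "${stamp:0:4}-${stamp:4:2}-${stamp:6:2} ${stamp:8:2}:${stamp:10:2}:${stamp:12:2}" +%s
}

# Hypothetical helper: print how many seconds the slave CSN is behind the
# master CSN. Returns 3 if either timestamp cannot be parsed.
CSN_LAG_SECONDS ()
{
    local master slave
    master=$(CSN_TO_EPOCH "$1") || return 3
    slave=$(CSN_TO_EPOCH "$2") || return 3
    echo $(( master - slave ))
}
```

A lag of zero corresponds to the 'ok' branch of the example script; a small positive lag could be mapped to 'warning' instead of 'failed' if brief replication delays are acceptable.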

 Edit nrpe.cfg on the Nagios client

Add script plugin command definition to /etc/nagios/nrpe.cfg. For example, the following was added at the end of the file:

.... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... .....
.... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... .....
.... ..... ..... ..... ..... ..... ..... ..... ..... ..... ..... .....
# config file is set to '1'.  This poses a potential security risk, so
# make sure you read the SECURITY file before doing this.
#command[check_users]=/usr/lib64/nagios/plugins/check_users -w $ARG1$ -c $ARG2$ 
#command[check_load]=/usr/lib64/nagios/plugins/check_load -w $ARG1$ -c $ARG2$ 
#command[check_disk]=/usr/lib64/nagios/plugins/check_disk -w $ARG1$ -c $ARG2$ -p $ARG3$
#command[check_procs]=/usr/lib64/nagios/plugins/check_procs -w $ARG1$ -c $ARG2$ -s $ARG3$
# To check if ldap replication is working
command[check_ldap_replication]=/usr/lib64/nagios/plugins/check_ldap_replication $ARG1$
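
Before restarting nrpe, it can help to confirm that the new definition is active and not commented out. The following is a minimal sketch, assuming the "command[name]=..." line format shown above; the function name NRPE_COMMAND_LINE is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the command line that an nrpe.cfg file assigns
# to a command name, or return 1 if there is no active definition.
NRPE_COMMAND_LINE ()
{
    local cfg=$1 name=$2 line
    # Match the literal "command[name]=" text, then drop commented-out lines.
    line=$(grep -F "command[${name}]=" "${cfg}" | grep -v '^[[:space:]]*#' | tail -n 1)
    [ -n "${line}" ] || return 1
    # Strip everything up to and including the first "=".
    echo "${line#*=}"
}
```

For example, NRPE_COMMAND_LINE /etc/nagios/nrpe.cfg check_ldap_replication should print the plugin path and its arguments once the entry has been added.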

Restart nrpe on the client.

[root@dbserv14v fg_root]# service xinetd stop
Stopping xinetd:                                           [  OK  ]
[root@dbserv14v fg_root]# service nrpe restart
Shutting down Nagios NRPE daemon (nrpe):                   [  OK  ]
Starting Nagios NRPE daemon (nrpe):                        [  OK  ]
[root@dbserv14v fg_root]# service xinetd start
Starting xinetd:                                           [  OK  ]
[root@dbserv14v fg_root]# 

Checking Script

On the Nagios server, test the new script from the plugin directory (/usr/lib64/nagios/plugins) using check_nrpe:

[root@nagios plugins]# /usr/lib64/nagios/plugins/check_nrpe -H ldap001v -c check_ldap_replication ldap101v.freightgate.com
OK- ldap001v.idc.freightgate.com 20170402011858Z#000000#00#000000 0
[root@nagios plugins]#
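
The transcript shows only the description text; the exit code is what Nagios actually uses to set the service status. As a small sketch, the wrapper below runs any plugin command line and prints the exit code next to its status name from the exit code table. The function names STATUS_NAME and RUN_PLUGIN are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: map a plugin exit code to its status name.
STATUS_NAME ()
{
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "out of range" ;;
    esac
}

# Hypothetical helper: run a plugin command line, let its output through,
# then print the exit code with its status name and return the same code.
RUN_PLUGIN ()
{
    local rc=0
    "$@" || rc=$?
    echo "exit code ${rc} = $(STATUS_NAME "${rc}")"
    return "${rc}"
}
```

For example: RUN_PLUGIN /usr/lib64/nagios/plugins/check_nrpe -H ldap001v -c check_ldap_replication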

Enabling the New Command on the Nagios Server

On the Nagios Server:

  • Edit commands.cfg
  • Edit services.cfg
  • Add a hostgroup by editing hostgroups.cfg

commands.cfg

Add the following command definition to commands.cfg. For remote execution, the command must be run through check_nrpe so that it executes on the remote server; otherwise it is executed on the local Nagios server.

# 'check_ldap_replication' command definition
define command{
        command_name    check_replication_ldap
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_ldap_replication $ARG1$
        }

services.cfg

Add the following service definition to services.cfg. Note that the parameter value is passed by using "!" to separate the command from the parameter.

define service{
        use                             generic-service
        hostgroup_name                  ldap_slaves
        service_description             LDAP REPLICATION
        check_command                   check_replication_ldap!ldap101v.logiqwest.com
        }

hostgroups.cfg

Add the following hostgroup definition to hostgroups.cfg. You may need to define another host in host.cfg.

define hostgroup {
        hostgroup_name  ldap_slaves
        alias   LDAP Slaves
        members ldap001v,ldap102v
}

Restart Nagios

Check the updated configuration on the Nagios server with the configuration check script:

[root@nagios CHECK_PROGRAMS]# ./check_nagios_configuration.sh 
Nagios Core 4.0.8
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 08-12-2014
License: GPL
Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...
Running pre-flight check on configuration data...
Checking objects...
Warning: Service 'Check if Amadeus service is up and operational.' on host 'www.amadeus.net'
Checked 430 services.
Checked 99 hosts.
Checked 30 host groups.
Checked 2 service groups.
Checked 2 contacts.
Checked 2 contact groups.
Checked 69 commands.
Checked 5 time periods.
Checked 0 host escalations.
Checked 0 service escalations.
Checking for circular paths...
Checked 99 hosts
Checked 0 service dependencies
Checked 0 host dependencies
Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...
Total Warnings: 1
Total Errors:   0
Things look okay - No serious problems were detected during the pre-flight check
[root@nagios CHECK_PROGRAMS]# 
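
The contents of check_nagios_configuration.sh are not shown in this article. The output above is what the Nagios pre-flight check (nagios -v) prints, so a wrapper along the following lines is one plausible sketch. The binary and configuration paths are assumptions that must be adjusted to the local installation, and the function name VERIFY_NAGIOS_CONFIG is hypothetical.

```shell
#!/bin/bash
# Assumed locations; adjust to the local installation.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/sbin/nagios}
NAGIOS_CFG=${NAGIOS_CFG:-/etc/nagios/nagios.cfg}

# Hypothetical wrapper: run the pre-flight check, show its output, and
# succeed only when the summary reports zero errors.
VERIFY_NAGIOS_CONFIG ()
{
    local output
    # The exit status of the check is ignored here; the summary line decides.
    output=$("${NAGIOS_BIN}" -v "${NAGIOS_CFG}" 2>&1) || true
    echo "${output}"
    echo "${output}" | grep -q '^Total Errors:[[:space:]]*0$'
}
```

Used as VERIFY_NAGIOS_CONFIG && service nagios restart, this would restart Nagios only when the pre-flight check reports no errors.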

Correct any errors and then restart Nagios:

[root@nagios CHECK_PROGRAMS]# service nagios restart
 Running configuration check...
 Stopping nagios: done.
 Starting nagios: done.
 You have new mail in /var/spool/mail/root
 [root@nagios CHECK_PROGRAMS]# 

Verify with Nagios Web Interface

Verify that the new service is working in the Nagios Web Interface.

Additional Testing

Make one of the clients fail by editing the remote script so that it generates CRITICAL, and confirm that Nagios reports the failure.
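
Rather than editing the check logic itself, one option is a temporary test hook that overrides the result. This is only a sketch: the function name and the marker file path are hypothetical and are not part of the original script.

```shell
#!/bin/bash
# Hypothetical test hook: force a CRITICAL result while a marker file exists.
# The default marker path is an assumption chosen for illustration.
FORCE_FAILURE_IF_REQUESTED ()
{
    local marker=${1:-/tmp/force_critical}
    if [ -e "${marker}" ]; then
        RESULTS='failed'      # the example script maps 'failed' to CRITICAL
        DESCRIPTION="forced failure for alert testing"
    fi
}
```

In the example script, the hook would be called between RUN_LDAP_REPLICATION_CHECK and OUTPUT_RESULTS. Creating the marker file triggers the alert and removing it restores normal behavior; the hook should be removed once testing is complete.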

 
