
Monitoring Linux Hosts with Nagios via NRPE

kavin | Operations | 2022-11-20

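I. Introduction

1. Introduction to NRPE

NRPE is an extension to Nagios that executes plugins on remote Linux/Unix hosts. With the NRPE add-on and the Nagios plugins installed on a remote server, that server can report its local state, such as CPU load, memory usage, and disk usage, to the Nagios monitoring platform. In this article the Nagios monitoring side is called the Nagios server, and the remote monitored host is called the Nagios client.

Nagios can monitor remote hosts in several ways, including SNMP, NRPE, SSH, and NSCA. This article covers monitoring remote Linux hosts through NRPE.

NRPE (Nagios Remote Plugin Executor) is a daemon that runs check commands on a remote server. It lets the Nagios monitoring server securely trigger check commands on the remote host and returns the results to the monitoring server. Its execution overhead is much lower than that of SSH-based checks, and because a check needs no system account credentials on the remote host, it is also considered safer than SSH-based checks.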


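2. How NRPE Works

NRPE consists of two parts:

check_nrpe plugin: resides on the monitoring host

nrpe daemon: runs on the remote host and acts as the agent on the monitored side

Note: the nrpe daemon relies on the nagios-plugins package; without it the daemon cannot perform any checks.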


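How NRPE works in detail

When Nagios needs to check a service or resource on a remote Linux host:

First: Nagios runs the check_nrpe plugin and tells it what to check;

Second: check_nrpe connects to the remote NRPE daemon over SSL;

Then: the NRPE daemon runs the corresponding Nagios plugin to perform the check;

Finally: the NRPE daemon returns the result to check_nrpe, which hands it to Nagios for processing.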
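The four steps above can be sketched as a small local simulation. This is only an illustration: there is no network or SSL here, and fake_nrpe_daemon, fake_check_nrpe, state_name, and the command names check_ok, check_warn, and check_critical are hypothetical stand-ins, not part of NRPE. The exit codes 0 to 3 follow the usual Nagios plugin convention (OK, WARNING, CRITICAL, UNKNOWN).

```shell
#!/bin/sh
# Hypothetical simulation of the NRPE request flow; no network or SSL involved.

# Daemon side: look up the requested command name, "run" the matching plugin
# (plain echo stand-ins here), and hand back its output and exit code.
fake_nrpe_daemon() {
    case "$1" in
        check_ok)       echo "OK - everything is fine"; return 0 ;;
        check_warn)     echo "WARNING - threshold exceeded"; return 1 ;;
        check_critical) echo "CRITICAL - service down"; return 2 ;;
        *)              echo "command '$1' is not defined"; return 3 ;;
    esac
}

# Monitoring side: stand-in for check_nrpe. It forwards the command name and
# relays the daemon's output and exit code unchanged.
fake_check_nrpe() {
    fake_nrpe_daemon "$1"
}

# Nagios side: translate a plugin exit code into a state name.
state_name() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

for cmd in check_ok check_warn check_critical check_missing; do
    # Capture both the text output and the exit code of the check.
    if output=$(fake_check_nrpe "$cmd"); then code=0; else code=$?; fi
    echo "$cmd -> $(state_name "$code"): $output"
done
```

The point of the sketch is that the monitoring side only ever sends a command name; what actually runs is decided by the lookup on the monitored host.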

II. Installing Nagios-plugins and NRPE on the Monitored Host

1. Add the nagios user

[root@ClientNrpe ~]# useradd -s /sbin/nologin nagios

2. Install nagios-plugins, which NRPE depends on

[root@ClientNrpe ~]# yum -y install gcc gcc-c++ make openssl openssl-devel

[root@ClientNrpe ~]# tar xf nagios-plugins-2.0.3.tar.gz
[root@ClientNrpe ~]# cd nagios-plugins-2.0.3
[root@ClientNrpe nagios-plugins-2.0.3]# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
[root@ClientNrpe nagios-plugins-2.0.3]# make && make install

# Note: to monitor MySQL, also pass --with-mysql to ./configure

3. Install NRPE

[root@ClientNrpe ~]# tar xf nrpe-2.15.tar.gz
[root@ClientNrpe ~]# cd nrpe-2.15
[root@ClientNrpe nrpe-2.15]# ./configure --with-nrpe-user=nagios \
> --with-nrpe-group=nagios \
> --with-nagios-user=nagios \
> --with-nagios-group=nagios \
> --enable-command-args \
> --enable-ssl
[root@ClientNrpe nrpe-2.15]# make all
[root@ClientNrpe nrpe-2.15]# make install-plugin
[root@ClientNrpe nrpe-2.15]# make install-daemon
[root@ClientNrpe nrpe-2.15]# make install-daemon-config

4. Configure NRPE

[root@ClientNrpe ~]# grep -v '^#' /usr/local/nagios/etc/nrpe.cfg | sed '/^$/d'
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_port=5666                 # listening port
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=192.168.0.105      # permitted address, normally the Nagios server

dont_blame_nrpe=0
allow_bash_command_substitution=0
debug=0
command_timeout=60
connection_timeout=300
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
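Because the monitoring server can only request commands that appear as command[...] entries, it helps to list exactly which names a given nrpe.cfg defines. The following is a minimal sketch; list_nrpe_commands is a hypothetical helper name, and the sample file simply mirrors a few of the entries above.

```shell
#!/bin/sh
# List the command names defined in an NRPE configuration file.
# Usage: list_nrpe_commands /path/to/nrpe.cfg
list_nrpe_commands() {
    # Keep only "command[name]=..." lines and print the text inside the brackets.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo against a throwaway sample file.
sample_cfg=$(mktemp)
cat > "$sample_cfg" <<'EOF'
server_port=5666
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
EOF

list_nrpe_commands "$sample_cfg"
rm -f "$sample_cfg"
```

On the monitored host the same function could be pointed at /usr/local/nagios/etc/nrpe.cfg.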

5. Start NRPE

# Start as a daemon
[root@ClientNrpe ~]# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
[root@ClientNrpe ~]# netstat -tulpn | grep nrpe
tcp        0      0 0.0.0.0:5666                0.0.0.0:*                   LISTEN      22597/nrpe
tcp        0      0 :::5666                     :::*                        LISTEN      22597/nrpe

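nrpe can run in one of two modes, which also determines how the service is managed:

-i    # Run as a service under inetd or xinetd
-d    # Run as a standalone daemon

You can write an init script for nrpe so that it runs in standalone mode: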

[root@ClientNrpe ~]# cat /etc/init.d/nrped
#!/bin/bash
# chkconfig: 2345 88 12
# description: NRPE DAEMON

NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg

case "$1" in
    start)
        echo -n "Starting NRPE daemon..."
        $NRPE -c $NRPECONF -d
        echo " done."
        ;;
    stop)
        echo -n "Stopping NRPE daemon..."
        pkill -u nagios nrpe
        echo " done."
        ;;
    restart)
        $0 stop
        sleep 2
        $0 start
        ;;
    *)
        echo "Usage: $0 start|stop|restart"
        ;;
esac
exit 0
[root@ClientNrpe ~]# chmod +x /etc/init.d/nrped
[root@ClientNrpe ~]# chkconfig --add nrped
[root@ClientNrpe ~]# chkconfig nrped on

[root@ClientNrpe ~]# service nrped start
Starting NRPE daemon... done.
[root@ClientNrpe ~]# netstat -tnlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address               Foreign Address             State       PID/Program name
tcp        0      0 0.0.0.0:22                  0.0.0.0:*                   LISTEN      1031/sshd
tcp        0      0 127.0.0.1:25                0.0.0.0:*                   LISTEN      1108/master
tcp        0      0 0.0.0.0:5666                0.0.0.0:*                   LISTEN      22597/nrpe
tcp        0      0 :::22                       :::*                        LISTEN      1031/sshd
tcp        0      0 ::1:25                      :::*                        LISTEN      1108/master
tcp        0      0 :::5666                     :::*                        LISTEN      22597/nrpe

III. Installing NRPE on the Monitoring Server

1. Install NRPE

[root@Nagios ~]# tar xf nrpe-2.15.tar.gz
[root@Nagios ~]# cd nrpe-2.15
[root@Nagios nrpe-2.15]# ./configure \
> --with-nrpe-user=nagios \
> --with-nrpe-group=nagios \
> --with-nagios-user=nagios \
> --with-nagios-group=nagios \
> --enable-command-args \
> --enable-ssl
[root@Nagios nrpe-2.15]# make all
[root@Nagios nrpe-2.15]# make install-plugin

# After installation, the check_nrpe plugin appears under libexec in the Nagios installation directory
[root@Nagios ~]# cd /usr/local/nagios/libexec/
[root@Nagios libexec]# ll -d check_nrpe
-rwxrwxr-x. 1 nagios nagios 76769 Sep 28 08:07 check_nrpe

2. Using check_nrpe

[root@Nagios libexec]# ./check_nrpe -h

NRPE Plugin for Nagios
Copyright (c) 1999-2008 Ethan Galstad (nagios@nagios.org)
Version: 2.15
Last Modified: 09-06-2013
License: GPL v2 with exemptions (-l for more info)
SSL/TLS Available: Anonymous DH Mode, OpenSSL 0.9.6 or higher required

Usage: check_nrpe -H <host> [ -b <bindaddr> ] [-4] [-6] [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]

Options:
 -n         = Do no use SSL
 -u         = Make socket timeouts return an UNKNOWN state instead of CRITICAL
 <host>     = The address of the host running the NRPE daemon
 <bindaddr> = bind to local address
 -4         = user ipv4 only
 -6         = user ipv6 only
 [port]     = The port on which the daemon is running (default=5666)
 [timeout]  = Number of seconds before connection times out (default=10)
 [command]  = The name of the command that the remote daemon should run
 [arglist]  = Optional arguments that should be passed to the command.  Multiple
              arguments should be separated by a space.  If provided, this must be
              the last option supplied on the command line.

Note:
This plugin requires that you have the NRPE daemon running on the remote host.
You must also have configured the daemon to associate a specific plugin command
with the [command] option you are specifying here.  Upon receipt of the
[command] argument, the NRPE daemon will run the appropriate plugin command and
send the plugin output and return code back to *this* plugin.  This allows you
to execute plugins on remote hosts and 'fake' the results to make Nagios think
the plugin is being run locally.

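Remote Linux hosts are monitored through NRPE with the check_nrpe plugin, whose syntax is:

check_nrpe -H <host> [-n] [-u] [-p <port>] [-t <timeout>] [-c <command>] [-a <arglist...>]

Running it with only -H returns the version of the remote daemon, which confirms that the connection works:

[root@Nagios libexec]# ./check_nrpe -H 192.168.0.81
NRPE v2.15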

3. Define the Command

[root@Nagios ~]# cd /usr/local/nagios/etc/objects/
[root@Nagios objects]# vim commands.cfg
# Append to the end of the file
define command{
        command_name    check_nrpe
        command_line    $USER1$/check_nrpe -H "$HOSTADDRESS$" -c "$ARG1$"
        }
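To see what this definition produces at run time, the sketch below imitates the macro substitution for a service whose check_command is check_nrpe!check_users. It is illustrative only: Nagios performs this expansion internally, expand_check_command is a hypothetical helper, it handles a single argument after the "!", and it assumes the command name equals the plugin file name (true for check_nrpe here). The plugin directory is passed in explicitly as the value of $USER1$.

```shell
#!/bin/sh
# Illustrative only: mimic how the macros of the check_nrpe command
# definition are filled in for one service check.
# Usage: expand_check_command <check_command> <host address> <plugin dir>
expand_check_command() {
    ecc_check=$1                 # e.g. check_nrpe!check_users
    ecc_addr=$2                  # stands in for $HOSTADDRESS$
    ecc_user1=$3                 # stands in for $USER1$ (plugin directory)

    ecc_name=${ecc_check%%!*}    # text before the first "!"
    ecc_arg1=${ecc_check#*!}     # text after the first "!", i.e. $ARG1$

    printf '%s/%s -H "%s" -c "%s"\n' "$ecc_user1" "$ecc_name" "$ecc_addr" "$ecc_arg1"
}

expand_check_command 'check_nrpe!check_users' 192.168.0.81 /usr/local/nagios/libexec
```

The part after the "!" becomes the value of -c, which is the command name the remote NRPE daemon looks up in its own configuration.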


4. Define the Services

[root@Nagios objects]# cp windows.cfg linhost.cfg
[root@Nagios objects]# grep -v '^#' linhost.cfg | sed '/^$/d'
define host{
        use                     linux-server
        host_name               linhost
        alias                   My Linux Server
        address                 192.168.0.81
        }
define service{
        use                     generic-service
        host_name               linhost
        service_description     CHECK USER
        check_command           check_nrpe!check_users
        }
define service{
        use                     generic-service
        host_name               linhost
        service_description     Load
        check_command           check_nrpe!check_load
        }
define service{
        use                     generic-service
        host_name               linhost
        service_description     SDA1
        check_command           check_nrpe!check_hda1
        }
define service{
        use                     generic-service
        host_name               linhost
        service_description     Zombie
        check_command           check_nrpe!check_zombie_procs
        }
define service{
        use                     generic-service
        host_name               linhost
        service_description     Total procs
        check_command           check_nrpe!check_total_procs
        }

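An important point: the command named after check_nrpe! in each service definition on the Nagios server must match a command[...] entry defined in nrpe.cfg on the monitored host. The server only passes the name; the monitored host decides what actually runs.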
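This correspondence can be checked mechanically. The sketch below compares the names used after check_nrpe! in an object file with the command[...] entries in a copy of the client's nrpe.cfg. It assumes you have copied nrpe.cfg from the monitored host to wherever you run it; missing_nrpe_commands and the sample file contents are hypothetical.

```shell
#!/bin/sh
# Report check_nrpe arguments used in a Nagios object file that have no
# matching command[...] entry in a copy of the client's nrpe.cfg.
# Usage: missing_nrpe_commands <object file> <nrpe.cfg copy>
# Returns 0 if every name is defined, 1 otherwise.
missing_nrpe_commands() {
    mnc_status=0
    for mnc_name in $(sed -n 's/^[[:space:]]*check_command[[:space:]][[:space:]]*check_nrpe!\([^![:space:]]*\).*/\1/p' "$1"); do
        if ! grep -q "^command\[${mnc_name}]=" "$2"; then
            echo "not defined on client: $mnc_name"
            mnc_status=1
        fi
    done
    return $mnc_status
}

# Demo with two throwaway files.
objects=$(mktemp)
nrpe_copy=$(mktemp)
cat > "$objects" <<'EOF'
define service{
        check_command           check_nrpe!check_users
        }
define service{
        check_command           check_nrpe!check_sda1
        }
EOF
cat > "$nrpe_copy" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
EOF

missing_nrpe_commands "$objects" "$nrpe_copy" || echo "mismatch found"
rm -f "$objects" "$nrpe_copy"
```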

5. Enable the Defined Commands and Services

[root@Nagios ~]# vim /usr/local/nagios/etc/nagios.cfg
# Add this line
cfg_file=/usr/local/nagios/etc/objects/linhost.cfg
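If you prefer to script this step instead of editing the file by hand, the line can be appended only when it is missing, so the step is safe to repeat. This is a minimal sketch; ensure_cfg_file is a hypothetical helper, and the demo works on a temporary file instead of the real nagios.cfg.

```shell
#!/bin/sh
# Append a cfg_file= line to a Nagios main config only if it is not there yet.
# Usage: ensure_cfg_file <nagios.cfg> <object file path>
ensure_cfg_file() {
    # -x matches whole lines only, -F treats the pattern as a fixed string.
    if grep -qxF "cfg_file=$2" "$1"; then
        echo "already present: $2"
    else
        echo "cfg_file=$2" >> "$1"
        echo "added: $2"
    fi
}

# Demo on a throwaway copy; the second call leaves the file unchanged.
demo_cfg=$(mktemp)
echo "cfg_file=/usr/local/nagios/etc/objects/commands.cfg" > "$demo_cfg"
ensure_cfg_file "$demo_cfg" /usr/local/nagios/etc/objects/linhost.cfg
ensure_cfg_file "$demo_cfg" /usr/local/nagios/etc/objects/linhost.cfg
rm -f "$demo_cfg"
```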

6. Check the Configuration Syntax

[root@Nagios ~]# service nagios configtest

Nagios Core 4.0.7
Copyright (c) 2009-present Nagios Core Development Team and Community Contributors
Copyright (c) 1999-2009 Ethan Galstad
Last Modified: 06-03-2014
License: GPL

Website: http://www.nagios.org
Reading configuration data...
   Read main config file okay...
   Read object config files okay...

Running pre-flight check on configuration data...

Checking objects...
        Checked 20 services.
        Checked 3 hosts.
        Checked 2 host groups.
        Checked 0 service groups.
        Checked 1 contacts.
        Checked 1 contact groups.
        Checked 26 commands.
        Checked 5 time periods.
        Checked 0 host escalations.
        Checked 0 service escalations.
Checking for circular paths...
        Checked 3 hosts
        Checked 0 service dependencies
        Checked 0 host dependencies
        Checked 5 timeperiods
Checking global event handlers...
Checking obsessive compulsive processor commands...
Checking misc settings...

Total Warnings: 0
Total Errors:   0

Things look okay - No serious problems were detected during the pre-flight check
Object precache file created:
/usr/local/nagios/var/objects.precache
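The line that matters in this output is the "Total Errors" summary. As a rough sketch, a script can read the pre-flight output and refuse to continue unless it reports zero errors. preflight_ok is a hypothetical helper, and the demo feeds it canned text instead of running Nagios.

```shell
#!/bin/sh
# Read pre-flight check output on stdin; succeed only if the summary
# reports exactly zero errors. A missing summary line counts as failure.
preflight_ok() {
    pf_errors=$(sed -n 's/^Total Errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p')
    [ "$pf_errors" = "0" ]
}

# Demo with canned output.
if printf 'Total Warnings: 0\nTotal Errors:   0\n' | preflight_ok; then
    echo "configuration looks safe to load"
else
    echo "fix the reported errors before restarting"
fi
```

In practice the input would come from the configuration check itself, for example by piping the output of service nagios configtest into preflight_ok.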

7. Restart the nagios Service

[root@Nagios ~]# service nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.

8. Open the Nagios Web Monitoring Page

1) First click [Hosts] and check that the monitored host's status is UP

2) Then click [Services] and check that each monitored service's status is OK

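Note: for the newly added host linhost, one service shows a CRITICAL state with the message "No such file or directory". The check_hda1 command points at /dev/hda1, which does not exist on this host; its disk is /dev/sda1. The fix is as follows.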

### On the monitored host: edit the NRPE configuration file and restart the NRPE service
[root@ClientNrpe etc]# vim nrpe.cfg
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
[root@ClientNrpe etc]# service nrped restart

### On the monitoring server: edit linhost.cfg and restart the nagios and httpd services
[root@Nagios objects]# vim linhost.cfg
# Note: this used to be hda1; it is now changed to sda1
define service{
        use                     generic-service
        host_name               linhost
        service_description     SDA1
        check_command           check_nrpe!check_sda1
        }
[root@Nagios ~]# service nagios restart
Running configuration check...
Stopping nagios: done.
Starting nagios: done.
[root@Nagios ~]# service httpd restart
Stopping httpd:                                            [  OK  ]
Starting httpd:                                            [  OK  ]
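A problem of this kind can be caught on the monitored host before Nagios reports it. The sketch below pulls the path given to -p in each check_disk entry of an nrpe.cfg and tests whether it exists on the local machine. check_disk_paths is a hypothetical helper, it assumes one -p per entry and paths without spaces, and the demo uses a throwaway file.

```shell
#!/bin/sh
# Print every path passed via "-p" to check_disk in an nrpe.cfg and report
# whether it exists on this machine. Run it on the monitored host.
# Usage: check_disk_paths /path/to/nrpe.cfg
# Returns 0 if all paths exist, 1 if at least one is missing.
check_disk_paths() {
    cdp_status=0
    for cdp_path in $(sed -n 's/^command\[[^]]*]=.*check_disk.* -p  *\([^ ]*\).*/\1/p' "$1"); do
        if [ -e "$cdp_path" ]; then
            echo "exists:  $cdp_path"
        else
            echo "missing: $cdp_path"
            cdp_status=1
        fi
    done
    return $cdp_status
}

# Demo with a throwaway file containing the original and the corrected entry.
demo_nrpe=$(mktemp)
cat > "$demo_nrpe" <<'EOF'
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
EOF

check_disk_paths "$demo_nrpe" || echo "at least one check_disk path is missing"
rm -f "$demo_nrpe"
```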

Click [Services] again to refresh the page and confirm that the disk check no longer reports CRITICAL.
