Changes

OPS635-lab-nagios

1,573 bytes added, 16:40, 13 January 2020

==Investigation 3: Nagios Custom Commands==

* Create a script plugin called check_sshd that will use systemctl to check the state of your sshd service. If the service is running, return 0. If it is inactive, return 1. If it is failed, return 2. For any other result, return 3.
* Create a command definition called check_sshd_status that will call the check_sshd plugin.
* Create a new service definition that will use the new command to check the status of your sshd service every two minutes, going into a hard-fail state on the third failed check.
* Create an event handler script to restart sshd if it is inactive. Use the nagios macros to make sure it only tries to restart the service on the second failed check (that is, before it goes into a hard-fail state).

* Add notifications similar to those for your other checks (you should be notified if the service goes into a hard-fail state, and the senior admin should be notified if you don't fix it).
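
The return-code mapping for the plugin and the restart decision for the event handler can be sketched as small Python functions. This is only one possible shape, not the expected solution: the function names are hypothetical, and it assumes the event handler is passed the values of the nagios macros $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ as arguments, and that an inactive service (return code 1) shows up in nagios as WARNING.

```python
import subprocess

# Return codes the lab asks for, keyed by the text `systemctl is-active` prints.
STATE_TO_EXIT = {"active": 0, "inactive": 1, "failed": 2}


def exit_code_for_state(state: str) -> int:
    """Map a systemctl state string to the plugin return code (3 for anything else)."""
    return STATE_TO_EXIT.get(state.strip(), 3)


def query_state(unit: str = "sshd") -> str:
    """Ask systemd for the unit's state; return 'unknown' if systemctl cannot be run."""
    try:
        result = subprocess.run(
            ["systemctl", "is-active", unit],
            capture_output=True, text=True, check=False,
        )
    except OSError:
        return "unknown"
    return result.stdout.strip()


def should_restart(state: str, state_type: str, attempt: int) -> bool:
    """Event-handler decision (hypothetical helper).

    Restart only when the service is WARNING (inactive), the state is still
    SOFT, and this is the second check attempt, i.e. before the hard fail.
    """
    return state == "WARNING" and state_type == "SOFT" and attempt == 2
```

A plugin built on this sketch would print a one-line status and then exit with `exit_code_for_state(query_state())`. An event handler would call `should_restart` with the three macro values and only then run `systemctl restart sshd`.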

==Investigation 4: Nagios Remote Commands==

* Under Construction
* Clone your existing VM again. Call the new VM nagiosnrpe.<yourdomain>.ops, provide it a static address of your choice, and add it to your DNS server.
* Install NRPE on nagiosnrpe.
* Make sure to modify the NRPE configuration on nagiosnrpe to allow your nagios server to contact it.
* Copy your check_sshd plugin to nagiosnrpe, making sure the user account for nrpe can run it. Note that you will have to negotiate this with sudo and selinux.
* Add a command to your nrpe configuration to allow remote execution of check_sshd.
* Start and enable the nrpe service, and allow traffic to it through your firewall.
* Back on your nagios server, add a new host definition for nagiosnrpe, and add a service that uses nrpe to run the check_sshd plugin on nagiosnrpe.
* Ensure that the check runs correctly, then do something to intentionally make it fail (e.g. stop the sshd service), and ensure that the failure gets recorded too.
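
Before relying on the nagios web interface, it can help to run the remote check by hand from the nagios server and see which status it maps to. The following is a minimal sketch under stated assumptions: the path to check_nrpe is assumed and varies by distribution, the helper names are hypothetical, and the NRPE command is assumed to be named check_sshd as in the steps above.

```python
import subprocess

# Standard nagios plugin return codes and their status names.
STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}

# Assumed install path; adjust to wherever your system places check_nrpe.
CHECK_NRPE = "/usr/lib64/nagios/plugins/check_nrpe"


def build_nrpe_command(host: str, command: str = "check_sshd") -> list:
    """Build the argument list that asks NRPE on `host` to run `command`."""
    return [CHECK_NRPE, "-H", host, "-c", command]


def status_name(return_code: int) -> str:
    """Translate a plugin return code into the nagios status name."""
    return STATUS_NAMES.get(return_code, "UNKNOWN")


def run_remote_check(host: str) -> str:
    """Run the remote check once and summarise it as 'STATUS: plugin output'.

    Raises FileNotFoundError if check_nrpe is not at the assumed path.
    """
    result = subprocess.run(
        build_nrpe_command(host), capture_output=True, text=True, check=False
    )
    return f"{status_name(result.returncode)}: {result.stdout.strip()}"
```

Calling `run_remote_check` with the name of your nagiosnrpe host, once while sshd is running and once after stopping it, should show the status change that nagios itself is expected to record.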

==Submission==

Upload your lab1.cfg, the nagios configuration from nagiosnrpe, your check_sshd plugin, and your event handler to blackboard.

==Completing The Lab==

You have now gained experience using common elements of nagios to monitor machines in your network. You have configured hosts that should be monitored, identified services to monitor on them, created contacts and notifications so that administrators will be notified when things go wrong (and senior admins can be notified if they don't get fixed), and used nrpe to allow checks to be performed remotely. You have also written simple checks to customize what you want monitored, and event handlers so that nagios can try to repair simple issues for you. There is still more to learn (host and service groups and dependencies will make your configuration much more efficient), but there is only so much room in the course. With what we have covered, you have the basic building blocks to monitor your network.

Peter.callaghan

