If you have never used it before, Supervisor is an application that runs on your server and monitors other applications. With supervisord, you get all of the benefits of turning long-running applications into daemons without the extra code required to daemonize them natively.
The configuration for supervisord is extremely flexible. Among other things, it lets you configure how many times a process will be restarted, the minimum amount of time an application must stay up to count as started, and what should happen if your long-running process crashes (i.e., should it restart, or do nothing?).
An example configuration for a project that is monitored by Supervisor:
Inside Supervisor's configuration files, you specify the process that you want to monitor and how many copies of the same program you want to run, and then reload Supervisor. If that program crashes, Supervisor will detect the signal from the child process it spawned and restart the process for you. No more wondering if your process is still running, and no more running your applications in screen.
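To make that concrete, here is a minimal sketch that uses Python's configparser to generate a program section. The program name worker, the command path and the numeric values are made up for illustration; the option names (command, numprocs, process_name, autorestart, startsecs, startretries) are standard Supervisor settings.

```python
import configparser
import io


def build_program_section(name, command, numprocs=1):
    """Return the text of a [program:<name>] section for supervisord.conf."""
    # interpolation=None keeps Supervisor's own %(...)s placeholders verbatim.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[f"program:{name}"] = {
        "command": command,
        # How many copies of the same program to run.
        "numprocs": str(numprocs),
        # With more than one copy, each process needs a distinct name.
        "process_name": "%(program_name)s_%(process_num)02d",
        # Restart only when the process exits with an unexpected exit code.
        "autorestart": "unexpected",
        # Seconds the process must stay up to be considered started (illustrative value).
        "startsecs": "10",
        # Failed start attempts allowed before Supervisor gives up (illustrative value).
        "startretries": "3",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()
```

Calling build_program_section("worker", "/usr/bin/python /srv/app/worker.py", numprocs=2) returns text that could be saved into a Supervisor configuration file, after which Supervisor needs to be reloaded to pick it up.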
Almost, that is.
Supervisor follows the restart rules you specify to the letter, so a process that keeps failing past its retry limit stays down until someone notices.
Luckily, Supervisor comes with a web server baked in, which is configured by setting up the [inet_http_server] section inside /etc/supervisord.conf. By giving Supervisor an IP address and port to listen on, you get an integrated interface that lists all of the processes that are currently running.
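As a rough illustration, the sketch below builds such a section and also derives the URL of the XML-RPC endpoint that the same web server exposes under /RPC2. The address 127.0.0.1:9001 and the credentials are placeholders, not values taken from any real setup.

```python
import configparser
import io


def build_http_server_section(host, port, username=None, password=None):
    """Return (config_text, rpc_url) for an [inet_http_server] section."""
    cfg = configparser.ConfigParser(interpolation=None)
    # Supervisor expects the listen address as "host:port" in a single option.
    cfg["inet_http_server"] = {"port": f"{host}:{port}"}
    if username and password:
        # Optional HTTP basic auth for the web interface and the API.
        cfg["inet_http_server"]["username"] = username
        cfg["inet_http_server"]["password"] = password
    buf = io.StringIO()
    cfg.write(buf)
    # The XML-RPC API is served by the same web server under /RPC2.
    return buf.getvalue(), f"http://{host}:{port}/RPC2"
```

Binding to 127.0.0.1 keeps the interface private to the machine; listening on a public address without a username and password would expose process control to anyone who can reach the port.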
That is fine for development, but when you're running long-running apps in production, you would normally also want those processes monitored to ensure that there are no issues with the availability of your application.
Cue the nagios-supervisord-processes repository.
check_supervisor is a Nagios plugin that checks the status of the processes configured in supervisord. It does this through Supervisor's XML-RPC API, which is served by the same built-in web server.
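For a feel of what that API returns, here is a small sketch using Python's standard xmlrpc.client. It is not the plugin's own code. supervisor.getAllProcessInfo is part of Supervisor's XML-RPC interface and returns one entry per process, including its group, name and statename; the helper names are hypothetical.

```python
import xmlrpc.client


def fetch_process_info(rpc_url):
    """Ask supervisord for the state of every process it manages."""
    server = xmlrpc.client.ServerProxy(rpc_url)
    return server.supervisor.getAllProcessInfo()


def not_running(processes):
    """Return 'group:name' for every process whose state is not RUNNING."""
    return [
        f"{p['group']}:{p['name']}"
        for p in processes
        if p["statename"] != "RUNNING"
    ]
```

With a live supervisord listening on an assumed address such as 127.0.0.1:9001, not_running(fetch_process_info("http://127.0.0.1:9001/RPC2")) would list every process that is stopped, starting, or has failed.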
If you already have Nagios checks in place, getting it up and running is mind-numbingly simple.
Install the script into your user scripts directory inside Nagios, add a command template and finally add one service for each program or service you want to monitor.
The Nagios check will automatically detect any configured processes that are not running and alert your system administrators accordingly.
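The general idea behind such a check can be sketched as follows. This is an illustration and not the plugin's actual logic: it takes process entries shaped like those returned by Supervisor's getAllProcessInfo call and maps them onto the standard Nagios plugin exit codes. Treating every non-running process as CRITICAL is an assumption made for the example.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nagios_result(processes, program):
    """Map Supervisor process info for one program to (exit_code, message)."""
    matches = [p for p in processes if p["group"] == program]
    if not matches:
        # Nothing by that name is configured, so the check cannot decide.
        return UNKNOWN, f"UNKNOWN - {program} is not configured in supervisord"
    down = [p["name"] for p in matches if p["statename"] != "RUNNING"]
    if down:
        return CRITICAL, f"CRITICAL - {program}: not running: {', '.join(down)}"
    return OK, f"OK - {program}: {len(matches)} process(es) running"
```

A real check would print the message on a single line and exit with the returned code, which is how Nagios decides whether to raise an alert.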
Check out the project over at GitHub and open an issue if you run into any problems or spot any bugs.