I run a Zabbix monitoring service that dies every few weeks because its MySQL tables stay locked for too long during a backup… Annoying, mostly because it then stays dead unnoticed, and not just for a few minutes. So, how do you monitor a monitoring service? Or simply… how do you restart any service that has gone away, in a simple way?
I recently came across monit. Its authors claim you can have it up and running in just 15 minutes. I got there faster.
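To give an idea of what that looks like, here is a minimal sketch of a monit control file entry for a Zabbix server. The pidfile path, the service commands, the port 10051 and the 60 second poll interval are assumptions for illustration, not my actual setup, so adjust them to your own install.

```
# Poll every 60 seconds (assumed interval)
set daemon 60

# Watch the Zabbix server process; the pidfile path is an assumption
check process zabbix_server with pidfile /var/run/zabbix/zabbix_server.pid
  # Start/stop commands are assumptions; use whatever your init system provides
  start program = "/usr/sbin/service zabbix-server start"
  stop program  = "/usr/sbin/service zabbix-server stop"
  # Restart if the (assumed) listener port stops answering
  if failed port 10051 type tcp then restart
  # Stop trying if it keeps flapping, so monit does not restart it forever
  if 3 restarts within 5 cycles then unmonitor
```

Running `monit -t` should check the control file syntax before you reload monit.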
Besides restarting dead processes, monit can also monitor network availability, application availability, file permissions and system utilization… I think this tool is really great for a small network, though I doubt it would scale well to a large one. Just give it a try.