Using monit Tool to Monitor Asterisk

Tom Keating : VoIP & Gadgets Blog
Tom Keating
CTO

Your IP-PBX is one of the most critical pieces of corporate infrastructure. It cannot afford any downtime, which is why the term "five 9s" (99.999%) of reliability was coined. While Asterisk is a pretty stable open source IP-PBX platform, it is still in its infancy, so it hasn't had the same time that the old 'Big Iron PBXs' have had to reach five 9s of reliability. Then again, many traditional PBX manufacturers have abandoned 100% proprietary hardware and now use many of the same standard off-the-shelf components that Asterisk systems run on, including motherboards, memory, processors, etc. So the old wives' tale that big iron PBXs are more reliable than PC-based PBXs no longer applies.

Still, Asterisk and all of its derivatives (trixbox CE/Pro, PBX in a Flash, etc.) have a cult following (of which I'm a member), and like any cult, we like to do crazy things, like tweak Asterisk or trixbox in the middle of the work day to see if some newfangled text-to-speech feature will work. Well, with so much tweaking by some Asterisk cultists, something is bound to go wrong, usually at the end of the work day on a Friday when you're driving home, forcing you to either return to the office or wait until you get home and SSH into Asterisk to restart the service.

So how do we ensure a more reliable Asterisk platform using an automated tool? Surely there must be a way of monitoring the Asterisk service and, if it crashes, automatically restarting it, right? Every second is precious when you're trying to achieve five 9s of reliability, which equates to 5 minutes, 15 seconds or less of downtime in a year. Or if you want to get really crazy, shoot for six 9s of reliability (99.9999%), which is 31.536s of downtime per year!
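
Those downtime budgets are just the unavailable fraction multiplied by the seconds in a year. Here is a minimal shell sketch of the arithmetic, assuming a 365-day year (which is what the figures above use). The function name downtime_seconds is made up for illustration and is not part of any tool.

```shell
#!/bin/sh
# downtime_seconds is a hypothetical helper name, not part of monit or Asterisk.
# It prints the allowed downtime per 365-day year, in seconds, for a given
# availability percentage.
downtime_seconds() {
    awk -v pct="$1" 'BEGIN {
        year = 365 * 24 * 3600              # 31,536,000 seconds in the year
        printf "%.3f\n", (100 - pct) / 100 * year
    }'
}

downtime_seconds 99.999     # five 9s: 315.360 s, i.e. 5 min 15.36 s
downtime_seconds 99.9999    # six 9s:  31.536 s
```

Note that a 2-minute polling interval alone would eat a large part of a five 9s budget on a single failure, which is worth keeping in mind when picking the check interval later on.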

Well, before we continue, you must remember that Asterisk runs on Linux and there are many great monitoring tools for Linux. In fact, for the blog web server you're reading this article on, I'm running a free monitoring tool aptly called monit, which you can get from http://mmonit.com/monit/download/. This tool is so easy to use, it should be in any Linux admin's arsenal. I use it to monitor various parameters of the blog server, and if certain conditions are met, it automatically restarts the Apache web service.

It got me thinking, "Why not use monit to monitor Asterisk?" Well, here's how to do it!

1) Install monit.
2) Simple way: Run 'yum install monit' or 'apt-get install monit', then skip the compile steps that follow.
3) Compile/Harder way: Go to http://mmonit.com/monit/download/ and download the tarball, currently called monit-5.0.tar.gz
4) Untar monit
# tar -zxvf monit-5.0.tar.gz
# cd monit-5.0
Configure and compile monit:
# ./configure
# make
5) Install monit
# make install
6) Copy the monit configuration file to the /etc/ folder (older versions used the filename monitrc)
# cp monit.conf /etc/monit.conf
7) Edit monit.conf and put in your monitoring rules (see examples below)
8) Add the monit service to the startup. The Red Hat commands follow:
# chkconfig --add monit
# chkconfig --level 2345 monit on
Confirm the run levels:
# chkconfig --list | grep monit
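
Before starting the daemon, it's worth checking the control file. monit refuses to run if the file is readable by group or others, and 'monit -t' does a syntax check without starting anything. Here's a rough sketch of that pre-flight check. The helper name conf_is_private is made up, it assumes GNU stat (stat -c), and the path is the one from step 6, so adjust it if your package put the file somewhere else.

```shell
#!/bin/sh
# conf_is_private is a hypothetical helper: it succeeds only when the file
# grants no permissions to group or others. Assumes GNU stat (stat -c).
conf_is_private() {
    mode=$(stat -c '%a' "$1") || return 1
    case "$mode" in
        0|*00) return 0 ;;    # e.g. 600 or 700: owner-only access
        *)     return 1 ;;    # e.g. 644: group/others can read
    esac
}

CONF=/etc/monit.conf    # path used in step 6; adjust if yours differs

# Only act when monit is installed and the control file exists.
if command -v monit >/dev/null 2>&1 && [ -f "$CONF" ]; then
    conf_is_private "$CONF" || chmod 600 "$CONF"
    monit -t -c "$CONF"    # -t checks the control file syntax only
fi
```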

It is super easy to set up the mail server for notifications and to configure monitoring of processes, files, loads (CPU, memory), and ports. And of course, using monit you can monitor Asterisk, trixbox CE or Pro, PBX in a Flash, and other IP-PBXs that run on Linux.

Here are snippets from two monit.conf configuration files (one from the blog server, the other from Asterisk):
###############################################################################
##
## Start monit in background (run as daemon) and check the services at 2-minute
## intervals.
#
set daemon  120 # can set lower if want downtime <2min
set mailserver mail.tmcnet.com     # primary mailserver
## You can set the alert recipients here, which will receive the alert for
## each service. The event alerts may be restricted using the list.
#
  set alert [email protected]          # receive all alerts
  set alert [email protected]
  check system blog.tmcnet.com
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert
  check process apache with pidfile /var/run/httpd.pid
    start program = "/etc/init.d/httpd start"
    stop program  = "/etc/init.d/httpd stop"
    if cpu > 60% for 2 cycles then alert
    if cpu > 80% for 25 cycles then restart
    if totalmem > 1300.0 MB for 5 cycles then restart
    if children > 250 then restart
    if loadavg(5min) greater than 10 for 8 cycles then stop
    if failed host blog.tmcnet.com port 80 protocol http
       and request "/monit/doc/next.php"
       then restart
    if failed port 443 type tcpssl protocol http
       with timeout 15 seconds
       then restart
    if 3 restarts within 5 cycles then timeout
    depends on apache_bin
    group server

# Asterisk Monitoring rule
set daemon 30 # Check every 30s
set logfile syslog facility log_daemon
set alert [email protected]
check process asterisk with pidfile /var/run/asterisk/asterisk.pid
  group asterisk
  start program = "/etc/init.d/asterisk start"
  stop program = "/etc/init.d/asterisk stop"
  # Check uptime via Asterisk Manager Interface (AMI) port 5038
  if failed host 127.0.0.1 port 5038 then restart
  if 5 restarts within 5 cycles then timeout

#Check Veritas BackupExec Agent
check host blog.domain.com with address 192.0.0.6
  start program = "/etc/init.d/VRTSralus.init start"
  #stop program = "/etc/init.d/VRTSralus.init stop"
  if failed port 10000 with timeout 35 seconds then restart
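
A port rule with no protocol, like the AMI check on port 5038 in the Asterisk rule, boils down to a plain TCP connection test. If you want to see by hand what monit will see, here's a minimal sketch. The helper name port_open is made up, and it assumes bash (for its /dev/tcp pseudo-device) and the coreutils timeout command are installed. It also assumes AMI is enabled and listening on 127.0.0.1, otherwise the check fails even when Asterisk is healthy.

```shell
#!/bin/sh
# port_open is a hypothetical helper that approximates a plain TCP connect
# test. It relies on bash's /dev/tcp pseudo-device and coreutils' timeout.
port_open() {
    host=$1
    port=$2
    # Succeeds only if the TCP connection is established within 3 seconds.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

if port_open 127.0.0.1 5038; then
    echo "AMI port 5038 is accepting connections"
else
    echo "AMI port 5038 is not reachable"
fi
```

If this reports the port as unreachable while Asterisk is running fine, fix that first, or monit will keep restarting a perfectly good PBX until the timeout rule kicks in.
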

Further, you can even test the SIP protocol, which uses port 5060. The SIP test is similar to the other protocol tests that monit supports, but it allows extra optional parameters:

IF FAILED [host] [port] [type] PROTOCOL sip [AND] [TARGET valid@uri] [AND] [MAXFORWARD n] THEN action [ELSE IF SUCCEEDED [[<X>] <Y> CYCLES] THEN action]

TARGET : You may specify an alternative recipient for the message by adding a valid SIP URI after this keyword.

MAXFORWARD : Limits the number of proxies or gateways that can forward the request to the next server. Its value is an integer in the range 0-255, set by default to 70. If max-forward = 0, the next server may respond 200 OK (test succeeded) or send a 483 Too Many Hops (test failed).

SIP examples:
  check host openser_all with address 127.0.0.1
   if failed port 5060 type udp protocol sip
      with target "localhost:5060" and maxforward 6
   then alert
 
  check host sip.broadvoice.com with address sip.broadvoice.com
   if failed port 5060 type tcp protocol SIP
      and target [email protected] maxforward 10
   then alert
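
To make TARGET and MAXFORWARD a bit more concrete, here's a sketch of the kind of request a SIP probe puts on the wire. As far as I understand it, monit's SIP test sends an OPTIONS request, so that is what this builds, but treat it as an illustration rather than monit's exact message. The helper name sip_options is made up, and the Via, From, and Call-ID values are placeholders.

```shell
#!/bin/sh
# sip_options is a hypothetical helper that prints a minimal SIP OPTIONS
# request. The Via, From, and Call-ID values are illustrative placeholders,
# not what monit actually sends.
sip_options() {
    target=$1        # SIP URI without the "sip:" scheme, e.g. localhost:5060
    maxforward=$2    # becomes the Max-Forwards header (0-255)
    printf 'OPTIONS sip:%s SIP/2.0\r\n' "$target"
    printf 'Via: SIP/2.0/UDP 127.0.0.1:5060;branch=z9hG4bKexample\r\n'
    printf 'Max-Forwards: %s\r\n' "$maxforward"
    printf 'To: <sip:%s>\r\n' "$target"
    printf 'From: <sip:probe@127.0.0.1>;tag=example\r\n'
    printf 'Call-ID: example-call-id@127.0.0.1\r\n'
    printf 'CSeq: 1 OPTIONS\r\n'
    printf 'Content-Length: 0\r\n\r\n'
}

# Same target and maxforward as the first SIP example above.
sip_options localhost:5060 6
```

Each proxy that forwards the request decrements Max-Forwards, which is why a low value keeps the probe from wandering far past the server you actually want to test.
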

Now that you know how to automatically monitor Asterisk, trixbox, PBX in a Flash, etc., those five nines (or six?) of reliability are just around the corner. As the PBX administrator / telecom manager, you will be worshipped by your sales team and boss for keeping the phone system up all the time. They will think you an Asterisk God, who will be adored and who shall command great respect and admiration. And none shall mourn for any Asterisk outages.

