Five nines, or 99.999%, has long been an overused and poorly understood standard for discussing availability. At best, the figure is often inaccurate, misleading or not applicable, but let’s be clear: a network such as the ANPI private IP network is designed, built and operated with the goal of delivering 99.999% network availability to our customers. Let’s first understand what a commitment to 99.999% uptime translates to in minutes of downtime per year. There are 525,600 minutes in a 365-day year; multiplying by 0.99999 gives 525,594.744 minutes of uptime, a difference of 5.256 minutes. Therefore, a network described as five nines should have no more than 5.256 minutes of downtime per year.
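The same arithmetic can be repeated for other availability levels. The short Python sketch below assumes a 365-day year, as the calculation above does; the function name and the list of levels are illustrative choices, not part of any standard.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def downtime_minutes_per_year(availability: float) -> float:
    """Return the yearly downtime budget implied by an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Illustrative levels: three, four and five nines.
    for level in (0.999, 0.9999, 0.99999):
        budget = downtime_minutes_per_year(level)
        print(f"{level:.3%} availability -> {budget:.3f} minutes of downtime per year")
```

Each additional nine cuts the budget by a factor of ten: roughly 525.6 minutes at 99.9%, 52.56 minutes at 99.99% and 5.256 minutes at 99.999%.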
Achieving this level of network availability takes considerable time spent analyzing the network design and supporting equipment. However, hardware design is only one of several factors that affect availability. Others include the loss of electrical power, which powers down routers and servers and turns off most phones; software availability; and the operational support processes needed to minimize human error.
Power availability can be addressed through the installation of uninterruptible power supplies (UPS) and generators. Phones can be plugged into a UPS or have internal batteries.
While there are multiple approved methods for estimating hardware reliability, with the parts count method and mean time between failures (MTBF) analysis being the most common, there is no comparably accepted methodology for estimating software availability. Software availability is therefore measured over time as the software ages and becomes stable. As a software package accumulates runtime, bugs and code defects can be corrected, improving availability. It is also important that the software restarts quickly, or the five nines target becomes increasingly elusive. Software maturity and restart time are the critical factors in software availability.
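To see why restart time matters so much, consider a rough sketch using the common steady-state formula, availability = MTBF / (MTBF + MTTR), where MTBF is the mean time between failures and the restart time stands in for the mean time to repair (MTTR). The 1,000-hour failure interval and the two restart times below are hypothetical figures chosen only for illustration, and the function names are likewise assumptions.

```python
def steady_state_availability(mtbf_hours: float, restart_seconds: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), with restart time used as the MTTR."""
    mttr_hours = restart_seconds / 3600.0
    return mtbf_hours / (mtbf_hours + mttr_hours)


def max_restart_seconds(mtbf_hours: float, target: float) -> float:
    """Longest restart time (in seconds) that still meets the target availability."""
    # Rearranging A = MTBF / (MTBF + MTTR) gives MTTR = MTBF * (1 - A) / A.
    return mtbf_hours * 3600.0 * (1.0 - target) / target


if __name__ == "__main__":
    mtbf = 1000.0  # hypothetical: one software failure every 1,000 hours
    for restart in (30.0, 600.0):  # hypothetical restart times in seconds
        a = steady_state_availability(mtbf, restart)
        print(f"restart {restart:>5.0f} s -> availability {a:.6%}")
    budget = max_restart_seconds(mtbf, 0.99999)
    print(f"restart budget for five nines: {budget:.1f} s")
```

Under these assumptions, a 30-second restart stays above five nines while a 10-minute restart does not even reach four nines, and the restart budget for five nines is about 36 seconds per failure.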
Finally, there is human error. Having good processes for maintenance, change management, provisioning, network monitoring, fault detection and repair can minimize human error. In addition to stringent and well-developed processes, human error is also addressed through higher levels of expertise and experience.
A network can be designed to meet 99.999% availability, but because of all the factors that can affect availability, actual service delivery is usually estimated at between 99.9% and 99.99%. However, defining service delivery in terms of minutes of downtime per year misses the most important point: service interruptions cost money, time and potentially lives, and users want the service to be available when they need it. Therefore, a service provider must view availability as an absolute requirement and make every effort to meet the 99.999% standard. Through continuous improvement and diligence, such a goal can be met.
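One way to see how a five nines design can end up delivering between 99.9% and 99.99% is to treat each factor as an independent dependency in series, so that end-to-end availability is the product of the individual availabilities. The sketch below makes that simplifying assumption, and every per-factor figure in it is hypothetical rather than a measured value.

```python
from math import prod
from typing import Mapping

MINUTES_PER_YEAR = 525_600  # 365-day year


def end_to_end_availability(factors: Mapping[str, float]) -> float:
    """Multiply per-factor availabilities, assuming independent factors in series."""
    return prod(factors.values())


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    factors = {
        "network hardware and design": 0.99999,
        "electrical power": 0.9999,
        "software": 0.9999,
        "operational processes": 0.9999,
    }
    total = end_to_end_availability(factors)
    downtime = (1.0 - total) * MINUTES_PER_YEAR
    print(f"end-to-end availability: {total:.5%}")
    print(f"implied downtime: {downtime:.1f} minutes per year")
```

With these illustrative inputs the combined figure is about 99.969%, or a little under three hours of downtime per year, even though the network hardware itself is rated at five nines.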