There are countless causes of data center downtime, but none is as problematic as an uninterruptible power supply (UPS) failure. The irony is that the UPS – as its name indicates – is supposed to ensure uptime if the main power source goes offline.
The leading cause of downtime
Given that a brief windstorm is enough to damage the electric grid, it's easy to see why these systems are essential to business continuity and why they're frequently put to use. That doesn't explain, however, why most of the recent high-profile cases of downtime have resulted from backup power failures. Here are just a few types of organizations that suffered downtime after the UPS failed to pick up the slack:
- Airlines: In 2016, two major U.S. airlines experienced business downtime that resulted from power outages, which were compounded by UPS failures in the data centers hosting critical systems. In both cases, passengers faced grounded flights, cancelations and delays.
- Colocation facilities: In September 2016, one of the U.K.'s leading data center services providers experienced an outage that was later traced to a cable connecting to the UPS. This event occurred a little less than a year after a different colocation facility dealt with a UPS-related failure.
- Cloud data centers: Even the most prominent cloud vendors have suffered UPS failures in the past few years, causing downtime for their clients.
Remote power monitoring and strong transfer switches can help
The examples listed above are only a few of the many instances in which UPS failures in the data center have disrupted business operations. Without diving too deeply into the cause of each, it's clear that UPS management has room for improvement.
A good place to start is to ensure that the UPS isn't treated like a secondary component. Technically speaking, it is a secondary source of power – and according to FacilitiesNet contributor John Yoon, UPS infrastructure is often kept out of sight toward the rear of the facility. Nevertheless, tapping into backup power isn't as simple as firing it up once the primary source goes offline. For instance, the UPS storage room must be treated like any other part of a data center, which means that optimal climate conditions (temperature, humidity, etc.) need to be maintained.
Once a power failure does occur, your facility's transfer switches need to shift to the secondary power source in 25 milliseconds or less so that services are not disrupted. While the backup power source is operational, operators need a way to monitor the UPS and supporting power management infrastructure in real time. For this reason, remote power monitoring is essential for detecting the earliest signs of a UPS failure.
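As a rough illustration, the two checks described above (a transfer that completes within 25 milliseconds, and real-time monitoring for early warning signs) could be sketched in Python as follows. This is a minimal sketch, not a vendor API: the `UpsReading` fields, the function names and every threshold other than the 25 ms transfer budget are hypothetical values chosen for the example, and real limits should come from the UPS manufacturer's documentation.

```python
from dataclasses import dataclass

# Transfer budget quoted in the article.
MAX_TRANSFER_MS = 25.0

# Hypothetical alert thresholds, chosen only for illustration.
MIN_CHARGE_PCT = 50.0
MAX_LOAD_PCT = 80.0
MAX_BATTERY_TEMP_C = 30.0


@dataclass
class UpsReading:
    """One telemetry sample from a UPS (field names are assumptions)."""
    battery_charge_pct: float
    load_pct: float
    battery_temp_c: float
    on_battery: bool


def transfer_within_budget(power_lost_ms: float, backup_online_ms: float) -> bool:
    """Return True if the switch to backup power took 25 ms or less."""
    return (backup_online_ms - power_lost_ms) <= MAX_TRANSFER_MS


def check_reading(reading: UpsReading) -> list[str]:
    """Return a list of warnings for a single telemetry sample."""
    warnings = []
    if reading.on_battery and reading.battery_charge_pct < MIN_CHARGE_PCT:
        warnings.append("battery charge low while on backup power")
    if reading.load_pct > MAX_LOAD_PCT:
        warnings.append("load near UPS capacity")
    if reading.battery_temp_c > MAX_BATTERY_TEMP_C:
        warnings.append("battery temperature high")
    return warnings


if __name__ == "__main__":
    sample = UpsReading(battery_charge_pct=40.0, load_pct=85.0,
                        battery_temp_c=35.0, on_battery=True)
    for warning in check_reading(sample):
        print("WARNING:", warning)
```

In practice, a monitoring service would feed readings like these from the UPS's management interface on a fixed interval and route any warnings to the operations team.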
Finally, once the primary source of power is functional again, an automatic transfer switch can immediately shift the electrical loads off the UPS and back to the original source – crisis averted.