Microsoft Service Outage Raises New Cloud Reliability Concerns
The recent outage of Microsoft’s Azure Cloud Platform-as-a-Service (PaaS) has reopened a persistent question about the reliability of Cloud services.
Microsoft’s most recent service disruption is just the latest evidence that Cloud alternatives are not immune to problems. Last year’s Amazon Web Services debacle was far worse and lasted much longer.
Yet overall, the uptime record of the major Cloud service providers is far better than that of most enterprise data centers. And the leading Cloud vendors are generally better at keeping their customers informed about the status of their services than in-house IT departments have been.
But no service provider can achieve 100% uptime. Therefore, it is important for IT and corporate decision-makers to fully understand the potential pitfalls and to put in place back-up and recovery capabilities that minimize the short-term and long-term impact on their business.
The vulnerabilities associated with relying on a third party for an important business function are obvious. But the remedy isn’t to forgo a potentially valuable service entirely. Instead, it is to put in place the right contingency plans to anticipate and mitigate the risks.
This process begins with thoroughly understanding how the Cloud vendor’s services are architected to determine if there are any inherent shortcomings in their design. There should also be a full accounting of their service performance records. Evaluating their service level agreements is important, but examining their problem escalation and notification policies is even more critical because these practices will determine how the service provider responds to issues when they occur.
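To make a service level agreement's uptime commitment concrete, it can help to translate the percentage into the hours of downtime it permits each year. The short Python sketch below does that arithmetic. The function name and the three percentages are illustrative assumptions chosen for this example, not figures from any vendor's actual agreement.

```python
# Illustrative sketch: the uptime percentages below are hypothetical
# examples, not terms taken from any vendor's actual SLA.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours in a non-leap year


def allowed_downtime_hours(uptime_percent: float) -> float:
    """Return the hours of downtime per year that an uptime commitment permits."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


if __name__ == "__main__":
    for pct in (99.0, 99.9, 99.99):
        hours = allowed_downtime_hours(pct)
        print(f"{pct}% uptime permits roughly {hours:.2f} hours of downtime per year")
```

Under these assumptions, a 99.9% commitment still leaves room for nearly nine hours of downtime a year, which is why the vendor's escalation and notification practices matter as much as the headline number.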
Service disruptions also serve a useful purpose because they make corporate decision-makers aware that they must put fail-over and other back-up and recovery systems and plans in place. Many organizations view these as needless and costly contingencies, like insurance policies that are never used. However, all it takes is one major outage to demonstrate the far greater cost of losing valuable data.
Mid-size businesses face a tough challenge because they generally lack the in-house skills to assess these risks and develop an effective plan to address them. They should turn to service providers who will openly discuss these issues, and learn what those providers have put in place to support their customers in the event of a problem. Alternatively, they should enlist the help of an independent firm to guide them through this process.
Either way, it is imperative that mid-market decision-makers not let their concerns about potential service disruptions dissuade them from capitalizing on the greater business benefits that can be derived from today’s Cloud services.
Disclosure: This post was written as part of the IBM for Midsize Business program, which provides midsize businesses with the tools, expertise and solutions they need to become engines of a smarter planet.