Preparing For The Next AMZN Cloud Outage

Amazon ($AMZN) made headlines over the weekend with the regional failure of AWS.  We won’t go deep into the details of what happened or who was affected, since many other outlets have covered that.  In short, an incident caused a loss of service in one region, and the service providers hosted there could not deliver to their customers during the outage.  While it is sexy to call it a cloud failure, the same end result could have occurred with any single-site implementation.  Hosting in your own data center, in the co-lo facility downtown, or at an unfortunate GoDaddy location could just as easily cause your net presence to disappear.

Business leaders should evaluate what needs to be improved or changed in terms of resiliency.  Those decisions will depend on the size of your business and what your concerns are.  A nano cap company (sub $50M market cap) will most likely have different requirements than a large cap global enterprise.  Rather than reinvent the wheel, you can use existing frameworks to organize your activities.  There are many out there, but today we will focus on ISO 27001 and ISO 22301.

Business continuity is one component of ISO 27001, while ISO 22301 addresses business continuity as a whole.  Section 4.2.1(d) of ISO 27001 requires that you identify the assets in the in-scope portion of the business and the owners of those assets, the threats to the assets, the vulnerabilities that those threats might exploit, and the impacts that losses of confidentiality, integrity, and availability would have on the assets.

Conducting a risk assessment, even in its most rudimentary form, is a good exercise for a business of any size.  The information you put together as part of the risk assessment can be useful in other areas as well, such as obtaining the right insurance coverage at the right price.  Fire or flood could impact your data center, or it could impact manufacturing and logistics.  Knowing this up front, you can decide whether to mitigate those risks or accept them.
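
For readers who prefer to see it written down, a rudimentary risk register could be sketched in a few lines of Python.  This is only an illustration and not anything the ISO standards prescribe: the 1-to-5 scoring scales, the likelihood-times-impact score, the appetite threshold of 10, and the two sample entries are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One row of a hypothetical risk register."""
    asset: str
    owner: str
    threat: str
    vulnerability: str
    # Assumed scales: impact and likelihood scored 1 (low) to 5 (high).
    confidentiality: int
    integrity: int
    availability: int
    likelihood: int

    def score(self) -> int:
        # Assumed scoring rule: likelihood times the worst of the three impacts.
        worst_impact = max(self.confidentiality, self.integrity, self.availability)
        return self.likelihood * worst_impact

    def decision(self, appetite: int = 10) -> str:
        # Assumed rule: mitigate anything that scores above the risk appetite.
        return "mitigate" if self.score() > appetite else "accept"


# Sample entries, invented for illustration only.
register = [
    Risk("Online order application", "Head of Sales",
         "Cloud region outage", "Single-region deployment", 1, 2, 5, 2),
    Risk("Customer database", "IT Manager",
         "Data center fire or flood", "No off-site backup", 2, 5, 5, 3),
]

# Print the register with the highest-scoring risks first.
for risk in sorted(register, key=Risk.score, reverse=True):
    print(f"{risk.asset}: score {risk.score()} -> {risk.decision()}")
```

A spreadsheet does the same job.  The point is simply that each asset gets an owner, a threat, a vulnerability, an impact rating, and an explicit decision to mitigate or accept.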

Not everything needs to be corrected or addressed, but a running checklist of issues makes a good road map.  A pizza restaurant with an online shopping cart may not care if the cloud provider of its online ordering application goes down.  Telephone, fax, and walk-in orders will keep the business running.  Cash flow, CapEx, OpEx, and other business drivers will influence the need for availability.  Not every business needs multiple data centers if it self-hosts, or multiple availability zones if it is in the cloud.