
“Customers don’t FEEL the average, they FEEL the variation”. 99.99% Uptime Still Leaves the .01% Chance of a Complete Disaster

Often there is a mismatch between your cloud technology partner’s Service Level Agreements (aka SLAs) and your requirements as an organization. I know of vendors that set SLAs (and any associated credits you would be eligible for) based on their ability to deliver the service in its entirety, as opposed to YOUR ability to perform critical functions once you’ve got past the login screen.

Most SLAs include something related to “system uptime”. Vendors will often post their month-over-month performance showing 99.99X% uptime. First off, be aware of what “uptime” means. In some cases it just means your ability to reach the URL/page on which you would log in to their solution.
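To put a posted uptime figure in perspective, it helps to convert it into the downtime it still permits. The snippet below is a minimal sketch of that arithmetic, assuming a 30-day month; the function name `allowed_downtime_minutes` is my own for illustration, and your vendor's SLA may define the measurement window differently.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Return the minutes of downtime an uptime percentage still permits.

    Assumes a flat measurement window of `days` full days (an assumption;
    check how your own SLA defines the window and any exclusions).
    """
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


for target in (99.9, 99.99):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.2f} minutes of downtime per 30 days")
```

On that basis 99.99% works out to roughly four and a half minutes a month, which tells you nothing about whether the functions behind the login screen were usable for the rest of it.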

The big miss, though, is the situation in which you can access the platform and log in, but a super critical function, sometimes the primary function of the application itself, is simply not available or is so slow that it’s unusable. What happens in those situations? This is where the quote from the title of this post comes into play. The average in this case is the average month-over-month uptime we mentioned earlier. An average is relatively easy for a cloud software vendor to articulate. However, the variation is what can make or break your organization for an entire day (or longer in some cases), resulting in increased costs and poor perception in the public arena.
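The gap between the average and the variation can be shown with a few lines of arithmetic. The sketch below uses made-up numbers (one four-hour outage of a critical function in an otherwise clean 30-day month) purely for illustration; none of the figures come from a real vendor.

```python
from statistics import mean

MINUTES_PER_DAY = 24 * 60

# Hypothetical data: minutes per day that a critical function was unusable,
# even though the login page stayed reachable the whole month.
function_outage_minutes = [0] * 30
function_outage_minutes[16] = 240  # one bad day: four hours unusable

# Availability of the critical function for each day, as a percentage.
daily_availability = [
    100 * (1 - outage / MINUTES_PER_DAY) for outage in function_outage_minutes
]

monthly_average = mean(daily_availability)
worst_day = min(daily_availability)

print(f"Monthly average: {monthly_average:.2f}%")
print(f"Worst day:       {worst_day:.2f}%")
```

The monthly average stays above 99%, while the one day your customers actually remember sits near 83%. And if "uptime" is measured at the login page, this outage would not register in the posted figure at all.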

The challenge you’ll run into is how to quantify or qualify such system impacts. While capturing root cause and the subsequent business impact may be extremely challenging, be sure to have this discussion and negotiate so that your bases are covered as far as reasonably possible. Don’t be afraid to look past the 99.99X% metrics. They are important (obviously), but they are table stakes (the bare minimum) in today’s cloud technology environment, and vendors that don’t hit such performance metrics will have a hard time retaining customers. Instead, see what you can negotiate into your contracts that is more in line with the loss or erosion of the critical subprocesses you rely on. “Server uptime” alone hides the fact that while you may be able to log in, you may not be able to perform key functions that are essential to your organization. Identify those big-ticket items up front and have a meaningful conversation about them. In the worst case, find levers you can negotiate with that mutually recognize the risk and identify other opportunities for value creation that you can tap into (e.g. better response times, a dedicated technical resource, free services, lower uplift at renewal, etc.). If you don’t tackle these nuances early on, don’t be surprised when, despite all that language about credits in the Master Subscription Agreement (aka MSA), you find you’re not eligible for any form of compensation when a critical function goes down.


