I’m not a big fan of Service Level Agreements (SLAs) as a provider or a consumer of them. But, since they are here to stay, here are three problems and ways to improve them.
Problem 1: Many operational SLAs are based on averages (e.g., average speed of answer, mean time between failures, mean time to close, mean time to resolve (MTTR)). In a normal distribution (where the mean and median are equal), 50% of performance is better than the average (green in diagram below) and 50% is worse (orange). Measured the way companies traditionally measure performance, SLAs will surprise customers almost every time: half the surprises will be good and half will be bad.
Recommendation: What customers really want is confidence that your solution will contain “no surprises.” If you must craft operational SLAs, focus on consistent performance. In this model, you align resources and processes to drive confidence in service delivery. For example, rather than defining a SLA such that the MTTR of an incident is less than 4 hours, consider reframing the metric so that 90% of incidents are resolved within 4 hours. In this way, only 10% of the incidents would be bad surprises. SLAs of this type encourage consistently good performance.
In Problem 1, companies focus on reducing the average to improve. In my recommendation, the average will take a back seat to improving consistency.
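To make the difference concrete, here is a minimal sketch comparing the two ways of framing the same SLA. The resolution times are made-up numbers chosen for illustration, and the helper name `fraction_within` is my own, not part of any standard tool.

```python
from statistics import mean

def fraction_within(durations_hours, target_hours):
    """Return the share of incidents resolved within the target time."""
    on_time = sum(1 for d in durations_hours if d <= target_hours)
    return on_time / len(durations_hours)

# Hypothetical resolution times (hours) for ten incidents; illustrative only.
resolution_hours = [1.0, 1.5, 2.0, 2.0, 2.5, 3.0, 3.0, 3.5, 9.0, 12.0]

mttr = mean(resolution_hours)                            # average-based view
within_target = fraction_within(resolution_hours, 4.0)   # consistency-based view

mean_sla_met = mttr < 4.0                  # "MTTR is less than 4 hours"
percentile_sla_met = within_target >= 0.90 # "90% resolved within 4 hours"

print(f"MTTR: {mttr:.2f} h, average-based SLA met: {mean_sla_met}")
print(f"Resolved within 4 h: {within_target:.0%}, 90% SLA met: {percentile_sla_met}")
```

With these sample numbers the average-based SLA passes (the MTTR is 3.95 hours) even though two of the ten customers waited 9 and 12 hours. The consistency-based SLA fails, which is exactly the signal a customer would want.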
Problem 2: To reduce their exposure to financial penalties, and because averages are easier to deliver than confidence factors, providers’ contracts focus on meeting average performance. Service providers make a financial calculation about those penalties. They craft SLAs to be (a) difficult to claim damages from and (b) less costly to breach than excellent service is to deliver. In my experience, customers choose providers because one provider has a better SLA than another. But the “better” SLA comes with few remedies and limited financial compensation when it is missed.
Recommendation: Read the small print and “game” a few scenarios to see if the penalties are valuable enough to overcome the inconvenience and business impact of missing the SLA. (See this example of a common way SLAs can be (are being) misused.)
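One way to “game” a scenario is to put rough numbers on it. The sketch below assumes a common contract shape, a service credit expressed as a percentage of the monthly fee and capped at a maximum percentage. Every figure in it (fee, credit percentage, cap, hourly business loss) is a hypothetical placeholder to be replaced with your own contract terms and estimates.

```python
def sla_credit(monthly_fee, credit_percent, cap_percent):
    """Credit owed for a missed SLA, limited by the contractual cap."""
    credit = monthly_fee * credit_percent / 100
    cap = monthly_fee * cap_percent / 100
    return min(credit, cap)

def net_impact(outage_hours, loss_per_hour, credit):
    """Business loss left over after the SLA credit is applied."""
    return outage_hours * loss_per_hour - credit

# Hypothetical scenario: all values are assumptions for illustration.
monthly_fee = 10_000     # what you pay the provider each month
credit_percent = 10      # credit for one missed SLA, as % of the monthly fee
cap_percent = 25         # maximum total credit, as % of the monthly fee
outage_hours = 6         # length of the outage
loss_per_hour = 5_000    # your estimated business loss per hour of outage

credit = sla_credit(monthly_fee, credit_percent, cap_percent)
remaining = net_impact(outage_hours, loss_per_hour, credit)

print(f"Credit received: {credit:,.0f}")
print(f"Uncompensated loss: {remaining:,.0f}")
```

In this made-up scenario the credit covers 1,000 of a 30,000 loss. If your own numbers look similar, the penalty clause is not what protects you, and the quality of service delivery matters far more than the wording of the SLA.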
Problem 3: SLAs based solely on operational (quantitative) metrics often miss what customers want. Customer needs vary all the time. Sometimes resolving a critical service outage in 4 hours is sufficient, but sometimes even a 2-hour resolution is too long. The point is, SLA measurements focused on MTTR alone may not achieve what customers need at a specific point in time. (See below on CSAT.)
Recommendation: Instead of looking for auto-escalation rules in your provider’s work processes, ask for flexibility to “escalate” an issue whenever you aren’t getting what you need. That way, providers modify their service actions to your needs.
Now for the sad truth… SLAs are here to stay. I’m not a big fan and while they may drive the right behavior, more often companies hide behind SLAs as defensive shields instead of delivering great service.
So… my suggestion is that every time you negotiate SLAs, you add a requirement to capture qualitative customer satisfaction (CSAT) metrics. Add an SLA that says something like “95% of all interactions will be rated by us as above average.” This small change will ensure your provider is contractually obligated to balance quantitative measures with qualitative measures. (Note: The pushback you’ll get when you ask for this change is that CSAT is subjective while the other SLAs are objective. True, but so what?)
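Checking a CSAT clause like this takes very little effort. The sketch below assumes a 1-to-5 rating scale on which 4 and 5 count as “above average”. Both the scale and the sample ratings are my assumptions for illustration, and your contract would need to define them explicitly.

```python
def share_above_average(ratings, threshold=4):
    """Share of interactions rated at or above the 'above average' threshold."""
    return sum(1 for r in ratings if r >= threshold) / len(ratings)

# Hypothetical ratings for twenty interactions on a 1-5 scale; illustrative only.
ratings = [5, 4, 4, 5, 3, 5, 4, 5, 5, 4, 5, 4, 4, 5, 5, 4, 5, 4, 2, 5]

csat_share = share_above_average(ratings)
csat_sla_met = csat_share >= 0.95   # "95% of interactions rated above average"

print(f"Rated above average: {csat_share:.0%}, CSAT SLA met: {csat_sla_met}")
```

Here 18 of 20 interactions clear the bar, so the provider lands at 90% and misses the 95% target, even if every quantitative SLA was met that month.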
With this background, I hope you’ve got some new ways to think about SLAs and will add CSAT to the long list of ways you gauge provider performance. The pivot point is that providers who welcome your subjective feedback (e.g., in the form of CSAT measurements) are more likely to be invested in your business success, not just their “metric” success.