I recently wrote an article about Service Level Agreements (SLAs) and why I think they are so hard to manage and track. But what if your organization already has its SLA implementation under control? Suppose it already provides reports to the management team and the customer showing accurate availability metrics for the business services your team delivers.
While working on SLA projects, I have found that we tend to focus intently on the end result: the reports. A report needs to be accurate, show historical data, and run reasonably fast. After all, this is what the management team will see. However, by the time you are showing the report to management or a customer, it’s too late. The SLA violation has probably already occurred, and your company will pay the price.
SLA implementations need to warn you before a breach occurs. In any SLA report, you can look at the metrics and determine that a business service was not available, but what was the root cause, and how can it be fixed? Better yet, how can future outages be prevented? We know that management and customers will view the SLA reports, but the operations team also needs access to this data. This is the team responsible for the IT resources, so it needs a way to view availability metrics. The team needs to be able to drill down into a business service and see which IT resources have failed. It needs a detailed list of outages and a way to view the root cause of each one. With this information, the team can research the cause of an issue and take action to prevent further outages.
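To make the idea of an early warning concrete, here is a minimal sketch in Python. It is not taken from any particular SLA product. The `Outage` record, the function names, the resource names in the usage note, and the 75% warning threshold are all assumptions chosen for illustration. The sketch turns an availability target into a downtime budget for the reporting period, compares recorded outage minutes against that budget, and groups downtime by IT resource so the operations team can drill down.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Outage:
    """One recorded outage (hypothetical record layout)."""
    resource: str      # IT resource that failed
    minutes: float     # how long the business service was unavailable
    root_cause: str    # free-text cause recorded by the operations team


def downtime_budget_minutes(target_pct: float, period_minutes: float) -> float:
    """Downtime allowed in the period before the availability target is missed."""
    return period_minutes * (1 - target_pct / 100)


def breach_status(outages, target_pct, period_minutes, warn_fraction=0.75):
    """Return 'ok', 'warning', or 'breached' for the current period.

    warn_fraction is an assumed threshold: warn once that share of the
    downtime budget has been consumed.
    """
    budget = downtime_budget_minutes(target_pct, period_minutes)
    used = sum(o.minutes for o in outages)
    if used > budget:
        return "breached"
    if used >= warn_fraction * budget:
        return "warning"
    return "ok"


def downtime_by_resource(outages):
    """Total outage minutes per IT resource, for drill-down."""
    totals = defaultdict(float)
    for o in outages:
        totals[o.resource] += o.minutes
    return dict(totals)
```

For example, a 99.9% target over a 30-day month (43,200 minutes) leaves a budget of about 43.2 minutes. With a 20-minute outage on one resource and a 15-minute outage on another, `breach_status` returns `"warning"` under the assumed 75% threshold, which gives the operations team a chance to act before the SLA is actually violated.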
Our goal is to meet a customer’s service level agreement. A good SLA implementation will not only provide reports to the customer, but also give your organization the tools it needs to meet those SLAs.