Service Level Management: The Big Picture
by Rick Sturm

To most people in the industry, the term Service Level Management (SLM) refers to a very narrow, finite space: Service Level Agreements, metrics, data capture, reporting, and so on. The products that IT departments and service providers purchase for SLM are predominantly performance management and reporting tools, plus tools to measure availability. While those are not the only tools, they are the ones that dominate the thinking of most service providers, and that reflects how they think about SLM.

However, SLM is much broader than that. It encompasses everything involved in delivering a service at acceptable levels. When the subject is put this way, most of us are quick to agree. Yet even then, most of us tend to narrow our thinking, concentrating primarily on performance management.

Infrastructure Behind the Scenes

It is true that performance management is an important part of SLM. It is the part that ensures that expected levels of service are provided, or even exceeded. Performance management is the piece of the puzzle that provides for troubleshooting performance problems and for continuous improvement of service levels. Similarly, fault management is a key component. Through fault management, problems are detected and addressed. Some of those problems affect availability, while others degrade performance. Either way, they affect the level of service being provided.

Unfortunately, these components are also reactive in nature. (Yes, I do realize that some products monitor performance in real time and make extrapolations, issuing warnings when there is a chance of violating a service level guarantee. However, I will argue that even that function has a sense of being reactive.) Fault and performance tend to grab the headlines in our minds. They represent the issues that we are concerned with on a daily basis.
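
The kind of extrapolation mentioned in the parenthetical remark above can be illustrated with a small sketch. This is not how any particular product works. It is a minimal Python illustration that assumes equally spaced response-time samples and a simple straight-line trend; the function names, the threshold, and the look-ahead value are all hypothetical.

```python
from statistics import mean


def projected_value(samples, steps_ahead):
    """Fit a least-squares line through equally spaced samples
    and return its value `steps_ahead` intervals past the last sample."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    xs = range(n)  # assumption: samples were taken at equal intervals
    x_bar = mean(xs)
    y_bar = mean(samples)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, samples))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * (n - 1 + steps_ahead)


def sla_warning(samples, threshold, steps_ahead=5):
    """Return True when the projected trend crosses the agreed threshold."""
    return projected_value(samples, steps_ahead) > threshold


# Hypothetical response times in milliseconds, one per monitoring interval.
recent = [100, 110, 120, 130]
if sla_warning(recent, threshold=145, steps_ahead=2):
    print("warning: response time is trending toward the service level limit")
```

Note that even this early warning only fires after measured performance has already started to drift, which is consistent with the argument that such functions remain reactive.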

"What else is there?" you may ask. The answer is simply "everything" in terms of infrastructure. If you are going to provide any kind of service, you must first "provision" it. That is, you create the system of hardware and software that is necessary for you to be able to provide the service. If the equipment and software selected are not appropriate for the service, there is no hope of delivering adequate levels of service. There must be facilities to house the equipment. The service must be configured to support the customer or internal user. If the configuration changes are not made, then the user is not going to be served.

Don't Wait for a Disaster

However, the list of frequently neglected areas does not stop there. In the wake of the tragedy of September 11, it is clear that backup and recovery are more important than ever before. Thorough and tested disaster recovery plans, including an alternate site available in the event of a disaster, have long been viewed by executives as an expensive luxury. They would look at the probability of an actual disaster occurring and contrast that with the cost of preparing for it. Far too often, they would decide that the danger represented an acceptable degree of risk.

Even without the complete obliteration of a building, thousands of smaller "disasters" occur every day. Some merely require restoring a corrupted database. In other cases, there may be actual damage to facilities or equipment. The notorious fiber-seeking backhoe may have just severed the only connectivity between your facility and your users. What are you going to do? Someone on your staff with a sick sense of humor may decide to create an anthrax scare by sending a "contaminated" letter to a co-worker. The police are called and your building is evacuated for two days of testing. I once worked in a high-rise building in the downtown area of a major city.
That building had to be evacuated when it was discovered that a small leak from a pipe under the street had filled a sub-basement with an explosive level of natural gas. Tornadoes, floods, earthquakes, power blackouts: the list of possible causes of disruption of service is seemingly endless. If you aren't well prepared to deal with the consequences of these disruptions, you are not prepared to deliver the service.

Security in an Insecure World

Another neglected area of SLM is security. The most likely single cause of a disruption or degradation in service is the well-intentioned but inept employee. You say that you have only the finest, error-free employees? Fine. What about the employees who are not well intentioned? Those people who, in the face of layoffs or out of pure malevolence, deliberately set out to damage your facility or disrupt the service? It happens every day and usually goes unreported.

Then there are the hordes of "crackers" who try to bring your service to its knees. It won't happen to you? Don't bet on it! The choice of targets often seems random. Being small in size or profile does not provide assurance of safety. Denial of Service attacks happen all too frequently and, like other security issues, usually go unreported.

Then, of course, there are the viruses. These are becoming increasingly sophisticated and destructive. Not only can viruses disrupt your service; they can also destroy your firm's relationship with users by causing sensitive information to be released.

As with backup and recovery, if you aren't deadly serious about security, then you are not prepared to deliver the service you are guaranteeing. If you're a user contemplating a contract with a service provider (even an in-house service provider, such as IT), your due diligence must assess how adequate the provider's security measures and disaster recovery plans are.
Of course, if you are like most users, you won't bother with due diligence and will instead rely on the representations made by the sales rep for the service provider. (If this is the case, I want to be your service provider, and while we're at it, I have some swampland in Florida that I'd like to sell you. I am assured that, although it is a swamp, it is totally free of snakes, mosquitoes and alligators.)

"And the Winner is ..."

In my last column, I asked readers to send in their stories of SLM in non-IT environments, with the promise that one of them would receive an autographed copy of "Foundations of Service Level Management." The winner is Anthony Dowdall. He is a Contract Manager for Interleasing in Birmingham, England. He has implemented an SLM process for the interface between insurance companies and people making repairs to insured equipment. He did this by creating a set of manual processes to capture the data and ultimately to log it into a database.

Rick Sturm is the founder and president of Enterprise Management Associates, the first technology analyst firm to specialize exclusively in management software and services.

Copyright (c) 2000-2003, nextslm.org. All Rights Reserved.