Tag Archive | "Availability"

Operations Center – Clustering & High Availability – Qmunity

Tags: Availability, Business Service Management, IT Management, IT Management Tools, Performance, Service Level


Business Service Management Commentary on IT Service Management, Service Level Management & Performance Management

IT Management - High Availability

We on The Hub do not usually publish tool-specific posts, but today I’m going to point you to a great best-practices post by Tobin.  Yes, he does mention a product, but the post is really about setting up High Availability systems and what that means.

Often when we speak with customers about Service Enabling their infrastructure and end-to-end visibility of services, we enter into a discussion of “nice to have” versus “must have”.  So why is management a “must have” today, and why is high availability relevant for management tools?  I always answer, “Would you want the pilot flying the plane blind, without instruments?  Why would you run your business blind?”

IDC forecasts that by 2013 more than 50% of IT budgets will be dedicated to outsourced IT.  While most IT organizations are providing high availability of services, it’s not based upon priority.  90% of IT organizations are still at level 2 of maturity, reacting to events rather than managing proactively.  This tells us that all technology is being managed equally, even as the business shouts for management by business priority and becomes more involved in buying decisions and deployment options.

Long story short, management and visibility are must-haves for getting proactive and removing the obstacles to embracing new technologies and services that drive the business.  This requires service enabling your infrastructure for visibility, and the management system must then be treated as mission critical and run on high availability infrastructure so it can continue to deliver high quality services.

Do you treat your management systems as high availability systems?  If not, why not?

Michele 

 ______________________________

Configuring Operations Center (and other products) within a High Availability (HA) configuration tends to confuse people. I guess it starts with the basic requirement of needing an application/service configured in a manner that gives it some level of Fault Tolerance (FT) and/or HA, which in turn reduces the possibility of outages (the system not being available to end users). FT helps with HA, but FT is not HA (more on that later).

Fault Tolerance (FT) is more about configuring the hardware of a system in a manner that, in the event of a failure (e.g., a hard drive), the system automatically recovers and continues to operate without intervention. The most common approach is around hard drives and leveraging RAID (Redundant Array of Independent Disks). Other solutions provide dual power supplies, NICs, etc.  Read More …

Where IT Dollars are Headed in 2012 – CIOInsight

Tags: Availability, BSM, Business Service Management, CIO, CIOInsight, Cloud, IT Management, IT Management Tools, Mobile, Performance, Service Providers, Transformation, Trends


Mobility and wireless network infrastructures are the big takers when it comes to IT budget planning for 2012, our latest study reveals. Even so, organizations are moving to the next stage of the IT infrastructure build-out across multiple budget areas, and our 2012 IT Investment Patterns Study shows how the strategy trends of innovation, integration and reversion are having a significant impact on 2012 spending patterns.  Read More Here . . .

_________________________________________

In this survey, the only area of spending in IT Operations/Management/Governance showing an increase is Data Center Management. Those following the lead of the service providers will see this as managing the technology to deliver the consistent, stable IT performance for business value described in my previous post. Mobile delivery options prevail as a leading technology as the consumerization/BYOD (bring your own device) trend in IT continues. However, these solutions must perform and be available to drive your organization's competitive advantage in the market.  This is the link to the business for 2012 investments: stitching together data from the many systems and applications in place today and turning it into real-time, actionable information.

Does your IT just run the business or does it drive the business?

Michele

Measuring Cloud Services with a Handshake or Storm Cloud

Tags: Availability, Business Service Management, Cloud, InfoWorld, IT Management Tools, Monitoring, Performance, Service Level, Transformation


A friend of mine, Richard Whitehead, recently posted two blogs (Two Lawyers and Shakin Up) on the topic of service level agreements, contracts and lawyers for cloud-based services.  My favorite quote in these posts is “Send lawyers, guns and money”.   All I can say is, if it comes to lawyers, guns and money, it just ain’t worth it.  Far more time is spent on negotiation and perceived service missteps than is put into the quality of service and driving revenue.

This is a topic near to my heart as I embark this week on drafting my own presentation on the topic, Cloud Service Contracts Get Stormy, for the upcoming Data Center World Conference in Orlando later this year.  As an analyst, I reviewed many outsourced service level agreements against industry best practices and reasonableness, and provided guidance regarding how to manage services.  There are a few common pieces of advice I suggest:

  • Service Accountability – you as the IT organization maintain ultimate accountability for the service to the business and your customers.
  • Operational Processes – you as IT no longer own “how” the service is delivered, only that it is delivered in an acceptable manner.
  • Operational Tools – you as IT no longer own the management tools and technology that monitor the delivery of the service.
  • Service Levels – accept the standard service levels; drive toward economies of scale and standardization.
  • Penalties – protect against gross negligence and harm to the business, not perfection for the sake of perfection.
  • End of Contract – data: who owns it, how is it transitioned – be sure the transition is covered to avoid hidden costs.

The first thing I suggest organizations review is their services – not all services are created equal.  Some drive revenue, others drive costs out of the organization.  I’ve covered this categorization in another post, Finding Your Services.  Classify your services and source the commodity services that are not unique to your organization.  The more unique and the more mission critical a service is, the less appropriate it is to outsource, unless you are early to market, do not have the talent in-house and need to buy it.  There is risk and reward in buying talent to deliver a market-changing service, and there is some forgiveness in the market for hiccups in this scenario.   Source with clear objectives and manage as such.

The second thing I usually end up discussing is “how” the service is delivered and “managed”.  Do you tell your electricity provider how to deliver electricity to your home or business?  Do you tell them the proper management practices and tools to use to deliver electricity?  And yet having adequate power is a piece of delivering IT services.  I believe what makes the sourcing of IT services different is that we as IT organizations have some expertise in delivering the commodity services.  While we want to ensure quality service, we need to step back, define the service and performance objectives, and manage to those, shifting our role from service deliverer to service manager.  It is up to the service provider to deliver the service and meet the agreed upon objectives.  The role of IT in these mixed environments is shifting to that of a service manager and communicator of service to the business as it drives revenue.  This is illustrated by the hottest growing job in IT, the Business Architect, who, according to a recent article in InfoWorld, translates technology and service performance for the business.

The next thing that comes up is the viability of the service provider.  This takes us slightly back to the previous paragraph: ensure the provider has processes, practices and tools in place to reasonably manage the service without mandating how the service is delivered.  There are other points of reference here as well regarding their financial viability, duration in business, other customers, and references – not just for good service, but for when service failed and how the provider responded.

Service levels and penalties are another topic of discussion, and where visions of guns and lawyers dance around.  Go back to step one and remind yourself of the value of the service.  Don’t demand unreasonable service levels, as they will come at a premium price, and don’t impose high penalties, which also raise the risk of the service to the provider and thus the cost to you.  Understand reasonable levels of performance, availability, responsiveness and security.  The more custom and imposing your SLAs and penalties, the higher the cost of the service, and thus the higher the cost to manage the service.

Finally, do not overlook the end-of-contract transition.  Ensure there are no surprises or hidden costs in transitioning systems, data, etc.

The final, final discussion is monitoring the performance of the service.  Budget 7% of the contract value to manage the vendor and service.  This includes monitoring the service to avoid those perception-versus-reality discussions about the health of the service.  It is still IT’s responsibility to manage the service and know how it is performing in order to take appropriate proactive action in the event of a hiccup, whether that means deploying additional resources, redirecting resources, etc., for services delivered in-house, by providers, or in the growing mixed environment.

In the end, the choice comes down to “pay now or pay later”.  I find it money better spent to monitor, manage and nurture the service provider relationship in the delivery of quality service than to pay a lot more to bring out the guns and lawyers at the end.  The first drives revenue growth; the second illustrates failure and revenue lost while hunting for a scapegoat.

I too listened to the customer speaker Richard references in his second blog post, who does business with a handshake and works toward driving revenue.  Lawyers and government are not the answer to regulating cloud service providers – drive your own destiny and revenue.  OK, back to my presentation. Come to Data Center World and hear more!

Michele

Business Service Management and Taxes!

Tags: Availability, Business Service Management, IT Management Tools


It’s everyone’s favorite time of year in the US: tax season.  It has been a few years since Intuit experienced a major outage on filing day, but many of us loyal TurboTax eFilers remember it fondly.  The original article appeared in Computerworld, and while the headline hypes the dollars forked over by Intuit, I recall it was more of a rounding error in the larger Intuit revenue total.

It does serve as a reminder of what an early warning system can provide: proactive signals calling attention to degrading components, the services they are tied to, and what the impact of an outage will mean.  In this case there was a monetary loss; however, I suspect the bigger threat was to brand and loyalty.

Intuit suffered a greater loss, which I blogged about last year, when their software-as-a-service QuickBooks went down for several days.  That one must have felt more like a Dilbert moment: they had a backup data center, in the same data center that experienced the power outage.  Doh!  I cannot say an early warning system would have helped in that case, and it was a bit more costly and again hit brand and loyalty hard.

I use these as examples that outages carry revenue costs (1-2% of revenue annually, and some single outages are 1-2% of revenue by themselves), but there is also a difficult-to-quantify loss of customer loyalty and faith in the company.

I recall I had just filed mine when the outage started.  I was late that year, but just ahead of the outage.  I am still a loyal TurboTax eFiler; however, I no longer wait until the last minute.  A sound business service management strategy and supporting tools are the dashboard for early warnings – just like the one in your car.

What do you use for your early warning system?

Michele

So You’re Tracking SLAs…Now What?

Tags: Availability, BSM, Business Service Management, Cost Reduction, IT Management Tools, Service Level


I recently wrote an article about Service Level Agreements (SLAs) and why I think they are so hard to manage and track.   Now what if your organization has its SLA implementation under control?   Your organization already provides reports to the management team and the customer showing accurate availability metrics for the business services your team delivers.

While working on SLA projects, I have found that we are intently focused on the end result…the reports.  The report needs to be accurate, show historical data, and perform reasonably.  After all, this is what the management team will see.  However, by the time you are showing the report to management or a customer, it’s too late.  The SLA violation has probably already occurred, and your company will pay the price.

SLA implementations need to warn you before a breach occurs.   In any SLA report, you can look at the metrics and determine that a business service was not available, but what was the root cause and how can it be fixed?   Better yet, how can future outages be prevented?   We know that management and customers will view the SLA reports, but the operations team also needs access to this data.   This is the team that is responsible for the IT resources, so they need a way to view availability metrics.  They need the ability to drill down into the business service and view which IT resources have failed.   They need a detailed list of outages and a way to view the root cause of those outages.   With this information, the team can research the cause of the issue and take action to prevent further outages from occurring.

Our goal is to meet a customer’s service level agreement.   A good SLA implementation will not only provide reports to the customer, but also give your organization the tools it needs to meet those SLAs.

 

Reaching Service Level Nirvana . . .

Tags: Availability, Business Alignment, Business Service Management, IT Management Tools, Service Level, Service Value


Ok, so we aren’t there yet.  The first part of getting over a problem is admitting that you have one.  How can we resolve the issues I brought up in my previous post?  Let’s talk about that now.
1.  Too many tools…
You are never going to reduce the number of tools you have down to one.  Someone will always need this tool or that functionality.  So, to resolve this, you need a tool that can pull data from multiple sources through integration: databases, APIs, web interfaces, traps, etc.  These tools do exist.
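As a rough illustration of that kind of integration (a sketch only, with made-up source formats and field names, not any particular product's API), the idea is simply to normalize records coming from a back-end database, a web API and SNMP traps into one common shape that a single dashboard can consume:

```python
# Minimal sketch: normalize availability data from several hypothetical
# sources into one record format so a single dashboard can consume it.
from dataclasses import dataclass
from datetime import datetime

@dataclass
class AvailabilityEvent:
    service: str        # business service the event belongs to
    component: str      # underlying resource (server, router, VM, ...)
    status: str         # "up" or "down"
    timestamp: datetime

def from_database_row(row: dict) -> AvailabilityEvent:
    # e.g. a row pulled from one monitoring tool's back-end database
    return AvailabilityEvent(row["svc"], row["host"],
                             "down" if row["outage"] else "up",
                             datetime.fromisoformat(row["ts"]))

def from_rest_payload(item: dict) -> AvailabilityEvent:
    # e.g. a JSON item returned by another tool's web API
    return AvailabilityEvent(item["serviceName"], item["node"],
                             item["state"].lower(),
                             datetime.fromtimestamp(item["epoch"]))

def from_trap(varbinds: dict) -> AvailabilityEvent:
    # e.g. fields decoded from an SNMP trap
    return AvailabilityEvent(varbinds["service"], varbinds["source"],
                             "down" if varbinds["severity"] >= 4 else "up",
                             datetime.now())

# One normalized stream, regardless of which tool produced the data.
events = [
    from_database_row({"svc": "CRM", "host": "db01", "outage": 1, "ts": "2011-01-10T09:15:00"}),
    from_rest_payload({"serviceName": "CRM", "node": "web02", "state": "UP", "epoch": 1294650000}),
    from_trap({"service": "EMail", "source": "mta01", "severity": 5}),
]
for event in events:
    print(event)
```
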
2.  SLA monitoring via trouble tickets
As I mentioned in my previous post, there is a lot of potential for human error here.  I would suggest that trouble tickets back up or provide the background reasons why the service level agreement (SLA) was violated, but they should never be the source used to measure those SLAs.  You also need your SLA to potentially have different thresholds for different parts and pieces.  Once you have integrated the sources of information in item 1, you should be able to build out your SLAs based on the business service, taking into account the different parts of the service and the areas where you have redundancy versus single points of failure, and then roll all of that up to a dashboard where you can see the results.
3.  SLA status based on Network availability
Total network availability should never be part of an SLA!  Your SLA should only include the parts/routes of the network that your service depends on.  Network availability is important, but not as important as service availability.  Ultimately the SLA is there to ensure that the customer can use the service.  If the service functions, then the SLA is good in the customer's eyes.  You need to build a model for the service so that you can take into account all of the parts of the service, both physical and logical, and include a synthetic transaction to confirm that the service is functioning.  One last point here: if the service is available but it takes 5 minutes to log in, the customer sees the service as down.  A well-defined SLA looks at all SLA components from the customer's point of view.
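For illustration, a synthetic transaction can be as simple as timing a scripted request and treating a slow or failed response as "down", since, as noted above, a five-minute login is down in the customer's eyes.  A minimal sketch, assuming a hypothetical URL and threshold:

```python
# Minimal synthetic-transaction check: the service only counts as "up"
# if it responds correctly AND within the time a customer would tolerate.
import time
import urllib.request

SERVICE_URL = "http://order-portal.example.com/login"  # hypothetical endpoint
MAX_RESPONSE_SECONDS = 5.0                              # hypothetical threshold

def synthetic_check(url: str = SERVICE_URL, limit: float = MAX_RESPONSE_SECONDS) -> bool:
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=limit) as resp:
            ok = resp.status == 200
    except OSError:          # DNS failure, connection refused, timeout, HTTP error, ...
        return False
    elapsed = time.monotonic() - start
    return ok and elapsed <= limit   # correct answer, delivered fast enough

if __name__ == "__main__":
    print("service up" if synthetic_check() else "service down")
```
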
4.  Can’t get the data
This can be a hard nut to crack.  If you have the ability to get the data but can't because of political reasons, then you have to involve the customer or a customer advocate.  Ask things like:  How important is it to you?  Point out the holes and the areas you will be blind to.  What happens if this part fails and we don't know it?  Ultimately this is either a big deal or it isn't.  If it isn't, fine.  If it is a big deal, then you can leverage the pain that the customer conveyed to you to get at that forbidden data.  Use the customer as the club to get at the data if needed.  No one can argue (successfully) against providing good service to the customer.
5.  Technical vs business data
You have integrated your data from the different sources and built out a model of the service, but the customer still complains?  Look at the service from a business point of view.  What tells me that the service is functioning?  Things like transactions processed per time period, web hits, database rows updated, etc.  Now treat this as data you need to integrate.  Pull in this data alongside your model to validate the technical data against the business data.
6.  Data is too bad
Ok, valid point, but everyone starts somewhere, and if you don't start now, maybe your successor will do it.  To overcome this one, simply do everything as above, only don't show the results to anyone.  Instead, use this data to improve the service, validate the model, confirm the SLA hours of availability, etc., before the data is shown to the customer or management.  Use this time to improve your monitoring and the functionality of your environment.
7.  SLAs just a punishment tool
Although I am sure you have seen this, it doesn't have to be this way.  Instead of struggling to meet the SLA, change it, further define it, and eliminate the false information.  Include the business information as mentioned above in item 5.  I have seen companies do this well and be willing to raise the penalties they would pay during business hours, because they eliminated all of the non-production information they were paying for that had nothing to do with the SLA.  They were also able to define exactly the SLA hours, when 5 9's were needed and when 5 7's was fine.  This can give you some breathing room as well as allow you to more easily meet the defined SLA.  It can also allow you to set up different levels of SLA, which then enable you to charge more for those services that 'must always be available.'
8.  SLAs are only historical and I need real time
I hear it all of the time: 'I can't worry about SLAs. I am trying to deal with right now.'  A well-defined SLA allows you to see the state of things right now AND can give you predictive warnings as well, notifying you not just when there is an outage but also when (if nothing changes) you will violate your SLA in X hours or n minutes.  This can take the service you provide to a whole other level, allowing you to see potentially customer-impacting issues before they violate your SLA.  How can you afford NOT to set up your SLAs?
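As a back-of-the-envelope illustration of that kind of predictive warning (made-up numbers, not any particular tool's algorithm), the remaining downtime budget for the period tells you how long a running outage can continue before the SLA is violated:

```python
# Sketch: estimate time until an SLA breach from the remaining downtime budget.
def remaining_downtime_budget(target: float, period_hours: float,
                              downtime_so_far: float) -> float:
    """How many more hours of downtime the SLA allows this period.
    During an active outage, this is also the wall-clock time to breach."""
    allowed = period_hours * (1.0 - target)      # total downtime budget for the period
    return max(allowed - downtime_so_far, 0.0)

# Example: 99.9% over a 30-day (720-hour) month with 0.5 h of downtime already used.
budget_left = remaining_downtime_budget(0.999, 720, 0.5)
print(f"SLA breach in {budget_left:.2f} hours if the current outage continues")
# 720 h * 0.1% = 0.72 h allowed; 0.72 - 0.5 = 0.22 h remaining
```
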
At the end of the day, well-defined, monitored SLAs can improve how you are perceived by the customer and improve the service you provide as well.  Can we ever get to SLA nirvana?  Yes, I think we can.  It's just a process that, when managed well and fed the correct information, really works for you.
Lee Frazier

At the End of the Day . . . . .

Tags: Availability, Business Alignment, Business Service Management, Service Value


No one questions the need to select an operating system for the data center; the debate is Windows? Linux? Both? No one questions the need to implement an identity and security solution for the data center; that’s an easy sell. And virtualization of the data center is becoming an accepted “best practice”.

But many people still seem to question the value of Business Service Management for the data center. This has always seemed puzzling to me, because of one simple question that seems to get lost in the shuffle…”At the end of the day…what is the ultimate purpose of the data center?”

In my opinion, the answer has always been a very simple one:

“The purpose of the data center is to support the mission-critical, revenue-producing, customer-facing business services that you deliver to your customer.”

Business Service Management is about understanding in “real time” the availability and performance of those business services and being able to measure the level of the service being delivered. And if those services aren’t available or they aren’t performing, being able to quickly determine the root cause of the problem in order to minimize the impact to the end users of that business service.

Without those mission critical, revenue producing, customer facing business services, there would be no need for an OS or a security solution or virtualization…in fact, there would be no need for a data center at all.

So the real question in my mind isn’t “Why would someone implement a Business Service Management solution for their data center?” It’s “Why isn’t Business Service Management being deployed in every data center on this planet?”

Ann Jones

Top Reasons We Have Not Reached SLA Nirvana – Yet

Tags: Availability, Business Service Management, IT Management Tools, Monitoring, Performance, Service Level


Why aren’t we at Service Level Agreement (SLA) nirvana?  I mean really, we have had SLA tools for 10, 15 years or more.  You probably have 1 or 10 or more tools that measure SLAs, most of which probably aren’t used.  Why aren’t all of our data centers, applications, servers and everything else just numbers on some dashboard that we glance at to make sure everything is good to go and that we are open for business?  This troubled me, so I decided to make a list of some of the possible reasons:

1.  Too many different tools, specialties and areas of focus

You have tools that measure SLAs for the network, different ones for the infrastructure, different ones for virtual machines, different ones for the cloud, and the list goes on and on.  I think this is one of the biggest issues with SLA reporting.  Who wants to look at 3 to 10 different tools to know if they are passing all of their SLAs?  Or who wants to maintain integrations into all of those tools to then pull all of that data into one dashboard?  And then what do you do if someone wants to see historical data?  This becomes a very deep and very big hole. So then companies move on to my number 2 reason.

2.  SLA monitoring via trouble tickets

Wow, this is great.  Finally, one source for all of our SLA data.  All we have to do is make sure every issue we have gets opened as an issue in our help desk tool.  Right!  Eventually you miss an outage, and that outage causes you to violate your SLA.  Then logic like this pervades the company: ‘If our tool missed that SLA, what else is it missing?’  And eventually: ‘We just can’t trust this tool’ or ‘We just can’t trust our monitoring’, etc.  Also, this is dependent on someone putting in the correct data and time.  Not to say they would purposely fudge the numbers, but how long would you say something was down that you were responsible for?

3.  SLA status based on Network availability

Ok, we have all been guilty of it.  If you have ever had to guarantee 5 9’s availability, you reported on just the network availability.  Why?  Because you had the data, your data met what was expected (5 9’s), and you could easily report on it.  Did that meet the intention of the SLA?  No, but (insert your excuse here).  When someone who cares about an SLA defines it as 99.999% availability, they truly want to be able to access the application or business function 99.999% of the time, not just reach the network.  This is discussed further in item 5.
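To put those numbers in perspective, here is a quick back-of-the-envelope conversion of common availability targets into allowed downtime per year; five nines leaves only a handful of minutes, which is exactly why it is tempting to report on the one layer that already meets it:

```python
# Allowed downtime per year for common availability targets.
MINUTES_PER_YEAR = 365 * 24 * 60

for target in (0.99, 0.999, 0.9999, 0.99999):
    downtime_min = MINUTES_PER_YEAR * (1 - target)
    print(f"{target:.3%} availability -> {downtime_min:8.1f} minutes of downtime per year")

# 99.000% ->  5256.0 minutes (~3.7 days)
# 99.900% ->   525.6 minutes (~8.8 hours)
# 99.990% ->    52.6 minutes
# 99.999% ->     5.3 minutes
```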

4.  Can’t get the data.

Sometimes we just can’t get at the data we would need, in an automated fashion, to allow us to have an SLA defined.  This may be due to political or technical issues; I am sure you have seen both.  This must be resolved by either the customer pushing for it or someone pushing on behalf of the customer.  In the IT world we live in today, virtually all data is accessible with permission and ingenuity.

5.  Technical vs business data

This one is also very common.  You report you are meeting your SLA of 99.999% uptime and the customer says, ‘but it is never available when I need to use it.’  Been there?  Why is this?  Because you are reporting that all of the things you are responsible for technically are available.  But when the customer goes to use the application or business service, some piece that they use, and that you might not be responsible for, isn’t functioning or responding in a timely manner.  Does this make your SLA data wrong?  Yes, from a customer perspective (and does anything else really matter?).  Your SLA must be looked at from the business point of view as much as possible.  Now, you won’t be able to take into account the customer’s home network being down and then having that blamed on you, but if you have enough data showing the service was available from a business point of view, you will be able to push back on them.

What do I mean by monitoring the SLA from a business point of view?  Well, it means a few things, and these will change depending on how your customer uses the service: throughput, response time, transactions processed per time period, synthetic transactions, and the functional status of all single points of failure for the service.

6.  Data is too bad

When you do get everything monitored and all of the data in one source, sometimes the data is just too bad.  Instead of 5 9’s, you’re showing 5 7’s.  So instead of showing this to the customer or management, you (insert your excuse here).  This issue can be overcome either by going into the underlying tools and fixing the monitoring to only report outages when they are outages, or by fixing your applications and infrastructure.

7.  SLAs just a punishment tool

I have seen this in many different companies.  You struggle to meet the SLAs, and whenever you miss, here comes the stick.  This will then motivate you to either fix the issues or quit reporting.  Too often I have seen the latter.  This doesn’t have to be.  Used correctly, SLAs can be a carrot and a stick. They can allow you to qualify exactly what is part of the SLA and what hours you are responsible for meeting the SLA, thereby reducing or eliminating penalties for off-hours and for devices that aren’t part of the service or not in your control, and allowing you to better meet the SLA for the true service times.  SLAs need to have the carrot to be managed effectively.

As we have remained in a reactive mode for many years, now is the time to turn that around, become proactive, and align with the objectives of the business.  In the next post we’ll talk about how you turn this around and stitch together a successful Service Level strategy.

What would you add to this list of challenges?

Lee Frazier

IT in 2011: We’re Managing Information, Not Just Technology – CIOInsight

Tags: Availability, Business Service Management, End-to-End View, Performance, Service Value


The Hub Commentary_

The Corporate Executive Board predicts that one of the driving IT trends for the next 12 months is the end-to-end view of managing technology as business services.  The complexity of the environment is turning business service management practices into an imperative rather than a nice-to-have.

The refocusing of resources on growth and driving revenue is coming more and more to the forefront, still balanced with good cost-conscious practices, but the time has come to move from focusing only on operating to driving growth.

This is a great article and the close is spot on: those that drive growth and put good business service management practices in place will lead their industry, and the others will play catch-up.  It’s not about IT and the business aligning; it’s about driving the business forward with technology.

How are you growing your business with technology?

Michele

___________________

Demand for increased business-partner control of IT is coming from opposite ends of the workplace spectrum: senior business executives and frontline end users.  IT in 2011: We’re Managing Information, Not Just Technology….. (Read Full Article…)

Virtual Business Service Management

Tags: Availability, Best Practices, Business Service Management, Performance, Service Value, Virtualization


I had a meeting with a customer the other day that was centered on virtualization and private cloud, and a funny thing happened: it morphed into a Business Service Management (BSM) discussion.  The discussion got me trying to put my arms around “BSM in a virtual environment” and what it means.

In a traditional BSM scenario, you are managing your IT from a business perspective with the ability to drill into the business models until you can isolate and correlate the technology supporting the business service.  But in a virtual environment, you’ve abstracted another layer, right?  For example, my order processing is running slow.  Which is of more value to me: knowing, one layer down, that the virtual database server is experiencing a performance hit, or knowing that the underlying network-attached storage lives in a network segment that is currently overloaded due to end-of-month processing?

To be honest, I’m not sure there is a clear answer.  I think if I am the “break/fix” guy, I need to know at the lowest level so I can attack the problem in the infrastructure.  And of course, the business unit wants to know at the highest level that order processing is experiencing a slowdown.

My conclusion was that at the highest level, BSM hasn’t changed.  You are still mapping out your business processes to show the health of the process flow and the performance of the service.  But to accurately, or perhaps more usefully, map the technology to the business process, you need to have the virtual abstraction layer, which in turn has the actual infrastructure mapped to it.  This could lead to some interesting analysis, such as response times across the virtual environment compared to response times in the physical environment, and how they both correlate to performance of the business service.  Just some food for thought.
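As a thought experiment (hypothetical names, not a product feature), the layered mapping might look something like this, with the business service mapped to virtual resources and each virtual resource mapped to the physical infrastructure it currently lives on, so a drill-down can stop at whichever layer is useful:

```python
# Sketch of a three-layer service model: business service -> virtual layer -> physical layer.
service_model = {
    "Order Processing": {                       # business service
        "vm-db01": {                            # virtual resources it depends on...
            "host": "esx-host-07",              # ...and the physical pieces each one lives on
            "storage": "nas-segment-3",
            "network": "vlan-120",
        },
        "vm-app02": {
            "host": "esx-host-02",
            "storage": "nas-segment-3",
            "network": "vlan-120",
        },
    },
}

def physical_footprint(service: str) -> set:
    """Everything in the physical layer that the business service ultimately touches."""
    footprint = set()
    for vm, underneath in service_model.get(service, {}).items():
        footprint.update(underneath.values())
    return footprint

print(physical_footprint("Order Processing"))
# {'esx-host-07', 'esx-host-02', 'nas-segment-3', 'vlan-120'}
```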

I’d be interested to hear about some real-life examples.  How are you monitoring the virtual?

Service Level Agreements: Why are they so hard to track? Just do the math!

Tags: Availability, Best Practices, BSM, Business Service Management, Service Level, Service Value


I have worked with many customers to track service level agreements in their BSM implementation. I can honestly say that there is only one thing that all of the projects had in common: they were extremely difficult.

Now, I was usually called in midway through the implementation, when the decisions had already been made and the schedule was looking impossible. Or even worse, I would become involved after the implementation had been put into production and the mistakes were already made.

So why are SLAs so challenging to track and manage?

  • Have you seen the contracts? In general, I don’t like contracts. I’m not a lawyer, and let’s face it, they can be difficult to decipher. With SLAs, the first thing that needs to be done is take the contract and figure out what exactly was promised. Then determine what underlying data should be used for the calculations. Then figure out how to get that data from the IT devices and put it all together for the service. These steps are crucial to success, and must all be done before implementing the SLA solution.
  • It’s just (total time – downtime)/total time… Saying that a service needs to be available 99% of the time during peak hours is easy. Determining the actual availability key metric is more challenging. You need to determine what exactly constitutes an outage, set up calendars for peak hours, and determine which outages shouldn’t count (should 1 second of downtime count?). The math for simple availability isn’t difficult, but accounting for all of the necessary factors…well, that is more complex (see the sketch after this list).
  • So many numbers…so little time. Since computers have existed, engineers have worked tirelessly to optimize performance. There are limitations to what software can do. One must think about the amount of data to be stored and calculated. For instance, if the data for availability is being stored every minute, and the report shows the last two years of availability metrics, oh, and also real-time metrics, this report is going to take some time to calculate and display the results.
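To make the second point concrete, here is a minimal sketch (made-up calendar and outage data) of how an availability key metric might be computed once you have decided on peak hours and a minimum outage duration that counts:

```python
# Sketch: availability = (total peak time - counted downtime) / total peak time,
# where only outages during peak hours and longer than a minimum duration count.
from datetime import datetime, timedelta

PEAK_START, PEAK_END = 8, 18          # peak hours: 08:00-18:00 (hypothetical calendar)
MIN_OUTAGE = timedelta(minutes=5)     # ignore blips shorter than 5 minutes

def peak_overlap(start: datetime, end: datetime) -> timedelta:
    """Portion of an outage that falls inside peak hours (same-day outages, for brevity)."""
    day_peak_start = start.replace(hour=PEAK_START, minute=0, second=0)
    day_peak_end = start.replace(hour=PEAK_END, minute=0, second=0)
    overlap = min(end, day_peak_end) - max(start, day_peak_start)
    return max(overlap, timedelta(0))

def availability(outages, peak_days: int) -> float:
    total_peak = timedelta(hours=PEAK_END - PEAK_START) * peak_days
    downtime = sum((peak_overlap(s, e) for s, e in outages if e - s >= MIN_OUTAGE),
                   timedelta(0))
    return (total_peak - downtime) / total_peak

outages = [
    (datetime(2011, 3, 7, 9, 0), datetime(2011, 3, 7, 9, 30)),   # 30 min in peak, counts
    (datetime(2011, 3, 8, 12, 0), datetime(2011, 3, 8, 12, 2)),  # 2-minute blip, ignored
    (datetime(2011, 3, 9, 19, 0), datetime(2011, 3, 9, 20, 0)),  # off-peak, ignored
]
print(f"{availability(outages, peak_days=22):.3%}")   # ~99.773% for a 22-day month
```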

These are the main three challenges I see when working with SLA implementations. Now how do we solve these?

  1. Know the data before starting. This sounds like a simple task, and most people think they have a good understanding of all of the underlying devices, metrics, and relationships that go into defining the service and the key metric for their SLA. No one would want to start implementing an SLA project without knowing all of the ins and outs. Or would they? People often start modeling their services and tying services to SLAs before all of the underlying infrastructure is in place. A thorough understanding of where this data will come from (monitoring software, trouble ticket systems, back-end databases) is critical because the calculation can change depending on the type of data.
  2. Determine what details can alter the key metric. As I mentioned earlier, calculating availability is not difficult. However, determining the total time and downtime can be. Take into account the time periods that define maintenance. Is there a weekly maintenance period? What counts as “on time”? Also, what sort of data can be ignored? Are there certain outages that do not affect the service’s availability? Don’t be too generic…try to figure out all of the details that contribute to the SLA’s key metric.
  3. Be realistic when creating reports. The dashboards or reports are what we really care about. We need a way to show how the SLAs are tracking. We need a nice way to get a quick visual on what might have failed or what is on its way to failing. Putting 1000 services on a single page is probably not the way to go. Let’s also not reinvent the wheel. If your organization has been calculating SLA metrics for years in an external program, use that data. Why spend the extra time to set up the lower level data to feed into a program that is going to do the same calculation?

Tracking and managing Service Level Agreements will continue to take time and effort. It requires buy-in from many different departments and resources, but BSM should and can simplify an SLA implementation.

Euro CIOs Look to IT Consolidation: Survey – CBR

Tags: Availability, Business Service Management, CBR, Consolidation, Cost Reduction, IT Management, Performance


The Hub Commentary_

This is the continuation of centralizing, sharing and cost saving with commodity IT.  These are short-term savings that improve the bottom line; however, they do not improve the top line.  They are required and must always be on the agenda, balanced with growth initiatives.

The article also mentions doing this while providing higher performance and availability, or quality of service.  Those initiatives work to drive the top line through customer retention and new services for the business.  It is all a balancing act, but the key is not to lose sight of long-term growth for the short-term save.

Michele

___________________

Cloud and virtualisation also on the agenda

CIOs across Europe have identified IT consolidation as a key near term initiative as they look for ways to maintain or improve performances despite the economic situation.  (Read Full Article…)

Outages Can Wreak Havoc on Productivity

Tags: Availability, Business Service Management, Gmail, Intuit, IT, Monitoring, Networking, Outages, Skype


In September 2009, Gmail went down for two hours. To hear the complaining on social networks like Twitter at the time, you would have thought the entire world had come to a standstill, and for many people it did. That’s because this service meant more to them than just a nice-to-have free service. People had actually come to depend on it to communicate for business and personal purposes.

Other high-profile outages have followed, including the Intuit outage last June and the Skype outage in December. These two outages lasted more than a day, leaving many unhappy users in their wake and providing a snapshot of what happens when your systems go down.

People who need these services to do their jobs are left looking for work-arounds that IT might not ultimately be happy with (like using unauthorized services to try and get something done).

The fact is that as you sit there looking at your monitoring dashboard, there are real people behind those red lights trying to get their work done, and these stories illustrate in a very concrete fashion that when services go down, whether it’s a public service or a private one, it can have a profound impact on actual users.  It can be easy to forget that as you look at the data in front of you on monitors, but it’s important to keep in mind that it’s not just some abstract representation of the service levels inside your company.

In fact, behind every red light you see on the dashboard is another person unable to complete a task using that service, and the more mission critical the service, the bigger the effect.

So as you monitor your systems, and review your data and watch the activity streaming through your equipment, always remember that there are humans who depend on these tools to do their jobs, and when a service goes down, even for a little while, it can have major ramifications.

Photo by nan palmero on Flickr. Used under Creative Commons License

Super Bowl, Victoria’s Secret and Business Service Management

Tags: Availability, Business Service Management, Cloud, CNN Money, IT Management, Service Value, USA Today


I can already hear you asking and scratching your heads, “Michele, what do those 3 things have in common – c’mon, get real”.  Yes, I’m a long-term IT nerd and a tremendous football fan who remembers everything.  Heck, I fessed up to crashing a data center for 7 hours in an earlier post.  My motto:  go large or stay home, or as we say in the south, “if you can’t run with the big dawgs, stay off the porch”!

Back in 1999, Victoria’s Secret ran their first-ever Super Bowl ad: $2.7M for 30 seconds.  These were the early days of online retail; wind back the clock and clear the cobwebs.  Yes, these were the early days of Amazon and the revolution in online sales and web hosting.  Then the unthinkable happened: 1M website visitors in an hour.  Was IT ready for this traffic?  Heck no!  Did IT and Marketing prepare and communicate this impending event?  Likely not.  Headline:  CNNMoney, 2/1/1999, “Victoria’s ‘Net Secret is Out”.  Can I fault them in those early-adopter days?  Nah, I applaud them as market innovators.  However, they brought the data center to its knees.  Marketing created interest that the data center could not fulfill, in an effort to generate Valentine’s sales with both men and women in an audience estimated at 100M (roughly a 3-cent spend per viewer), and generated millions in orders.  Not a bad return.

Now roll the clock forward.  It was almost 10 years before Vickies (as I call them) purchased another Super Bowl ad in 2008 – USA Today, 2/1/2008, “Victoria’s Secret back to Flirt with Super Bowl”. No outage headlines this time, but this is a prime example of Business Service Management practices and of IT understanding and operating with the business objectives in mind.

The moral of the story is that IT and the business must link up in order to support major bursts in spend, marketing, Super Bowl ads and traffic on the infrastructure, to reap the biggest return on the investment.  This year’s ads are $3M for a 30-second slot.  Who’s “going to the cloud”?  Who leveraged the cloud for the additional one-time capacity?  Now that is a story I’d love to read, linking Business Service Management and Cloud strategies.

Who’s going to make IT headlines after this year’s Super Bowl, or will it be 2012 for the Cloud Bowl headlines?  How cool would it be to make headlines because you leveraged the Cloud for additional capacity to reap the greatest reward from a marketing spend at the Super Bowl? Business Service Management in action!

As much as it hurts me to tell you this on so many levels, Texas Stadium is IT ready. Check out this video.

What I do find fascinating about this is that Texas is experiencing an unusual cold snap, and they are experiencing power outages.  However, the news says the power outages will not impact Texas Stadium.  No guarantees for all the Texas TV watchers, but the event is supposed to be impact-free. 100M viewers will determine that on Sunday!

P.S. – In case you are wondering – I’m a serious Colts fan, but this year the color of the helmet in the feature picture is no accident – Green Bay all the way!

End user of Business Service Management

Tags: Availability, Best Practices, Business Service Management, Integration, IT Management Tools


As an end user within an organization, I require a dashboard that I can log into from time to time to see the services that are offered to me and the health of the services I am currently using.   There are pieces of this that fall into the Service Catalog arena, but in the end, these services need to be managed.

The Service Management console needs to be able to slice and dice the infrastructure into the components of the individual services being provided to the end customers.  It should provide a view based on the role the person plays within the organization.  As an end user, I should see the services I can sign up for and the services I am already signed up for.  As a manager, I should be able to see the services that my team is using and the availability of those services.

End users do not and should not be required to know the servers, routers, NAS, etc. supporting a particular service.  To them, it is EMail, CRM, Timesheet and a slew of other service offerings.  The IT group needs to manage the services in the same way.   When users open tickets, it’s on the service, not the technologies supporting the service.  Business Service Management puts the focus of management on the service and the technologies supporting it.

Tobin

Improved Business Resilience w/ Cloud Computing–Cloud Computing Jrnl

Tags: Availability, Business Service Management, Cloud Computing Journal, Downtime, IT Management, Performance, Service Level


The Hub Commentary_

The article references the cost of a single downtime incident as approximately $100,000, and the risk of downtime increases as systems and infrastructures become more and more distributed and complex.  Now more than ever, services must be service enabled from a proactive monitoring perspective before they go live into a production environment. Management cannot be an afterthought, and keep in mind that not all services are created equal.

Service enabling and creating adequate redundancy come at a cost and have to be weighed against the value the service contributes to the business.  Managing the infrastructure as services is an imperative in 2011 to balance cost and value while ensuring service quality and availability.

Integrating the metrics from various technologies and making sense of them as an end-to-end service becomes critical to proactively managing services in real time and taking action based upon leading indicators that show the risk of an outage is rising.  Mitigating risk and reducing downtime must be a factor in service enabling the infrastructure as it goes live in production.

As the article states, the cost of downtime is high, even catastrophic, and the scavenger hunt that ensues to diagnose and restore service leads to lengthy downtime that is costly to your organization.  As technology professionals, we should remember that leveraging new technology and deploying agile infrastructures is just a piece of the puzzle; managing and service enabling the infrastructure is equally important.

This is the year of investment in IT technology as well as its management infrastructure, service enabling the infrastructure to ensure it continues to execute in market time.

Michele

___________________

North American businesses are collectively losing $26.5 billion in revenue each year as a result of slow recovery from IT system downtime according to a recent study. The study also indicates that the average respondent suffers 10 hours of IT downtime a year and an additional 7.5 hours of compromised operation because of the time it takes to recover lost data.  (Read Full Article…)

BI Becoming Key Enabler for IT Performance Management–TRAC Research

Tags: Analytics, Availability, BSM, Business Service Management, Integration, Performance, TRAC Research


The Hub Commentary_

Tobin and I had the opportunity to speak yesterday with a new friend, Bojan Simic of TRAC Research.  We shared thoughts on what is required to deliver Business Service Management (BSM) and help organizations communicate Service Performance, and thus Value, to their organizations.

As Bojan writes in his last BSM post, there are many management tools, each with a strength, and in all likelihood you have many in your environment.  In fact, we shared that there are organizations with a half dozen, those with a dozen and those with more than two dozen.  Yes, I said two dozen and greater.  Each of these contributes a piece to the story, but what is really required is the integration platform that brings it all together in a single view representing Service Performance.  By Service Performance we mean availability, performance, volume of business transactions, etc.

The environment is becoming ever more complex and agile, requiring the integration and automation that bring all the data together and allow your IT organization to take full advantage of best-of-breed monitoring tools.  With this end-to-end visibility in real time, you can then make sense of what you have, consolidate where necessary and potentially take advantage of lower-cost open source options.

The investment is in the integration and intelligent view of the infrastructure.  Where are you investing today?

Michele

___________________

Preliminary findings of TRAC’s end-user survey show that organizations are still struggling to gain full visibility into their IT services and infrastructure. Many of the organizations surveyed are reporting that, even though they made significant investments in new IT monitoring and management tools and increased the amount of performance data that they have on hand, they are still not seeing any significant improvements in key performance indicators (KPI).  (Read Full Article…)

EMA Radar for Business Service Management: Service Impact Q3 2010

Tags: Availability, BSM, Business Service Management, CMDB, CMS, EMA, IT Management, IT Management Tools, Performance, Service Level


Free Summary – EMA Radar for Business Service Management: Service Impact Q3 2010 – Enterprise Management Associates (Read Full Summary Report…)

Gartner’s Magic Quadrant Disses Amazon Cloud – NetworkWorld

Tags: Amazon EC2, Availability, Business Service Management, Cloud, Gartner, IT Management, NetworkWorld, Performance, Service Providers


The Hub Commentary_

“Visionaries have an innovative and disruptive approach to the market, but their services are new to the market and are unproven,” says Gartner.  Yes, this does describe Amazon and EC2, but does that mean it is two steps behind the Leaders?  There are not many opportunities to truly innovate and redefine a market.  Amazon and EC2 are redefining an industry, whether for internal enterprise IT or the consumer market.  Their customer base is now a broader mixture than that of the traditional IaaS (Infrastructure-as-a-Service) IT service providers.

To the innovator go the initial spoils: revenue in the market.  Whether they will or can sustain that leadership will play out over time.  Given where we are in the life cycle of this disruptive technology in Enterprise IT shops (mostly development and test), EC2 brings just the agility required for businesses to drive agile development of new products and services and move to market faster than ever before, mimicking the consumer market.

In order to take these early development projects to production, yes, more formalized monitoring and management will no doubt need to be baked in.  Now the question becomes: who bakes that in and supplies that information?  Isn’t that agent part of the workload that is packaged and shipped out to the cloud to run on the subscribed-to infrastructure?  I call this service enabling your workload: injecting intelligence into it for purposes of monitoring, securing and communicating the performance of the workload, and it is the subscriber’s responsibility.  Right?  The provider is responsible for the infrastructure your workload runs on, not what’s going on in the workload; that remains the responsibility of the subscriber.

I applaud Amazon and say keep challenging the status quo.  IT and the traditional proven providers need to think a little out of the box to meet the demands of market dynamics in market time! What do you think?

Michele

___________________

Gartner’s Magic Quadrant report has placed Amazon’s cloud computing service in one of its lower tiers, saying that for all of Amazon’s commercial success it is “visionary” but “unproven.”  (Read Full Article…)

Business Service Management and CMDB

Tags: Availability, Best Practices, BSM, Business Alignment, Business Service Management, CMDB, CMS, IT Management, IT Management Tools, ITIL, Service Level


So you have a console that has your Business Service Management views.   You set up the views to show the key services you are providing to your end customer(s) (EMail, Databases, CRM, etc.).  You are somehow bringing in monitoring data to light up the service views and show some type of condition and health. You figured out how to measure the Service Levels and provide all of these details back to the end users and management in a dashboard.  The question is, how do you maintain it?

If you have been following ITIL, one approach is to integrate the BSM solution with the CMDB solution (assuming they are different solutions).   The CMDB probably has discovery populating it with new CIs and updates to existing CIs.  The CMDB should have integrations with other systems for additional details around the CIs.   In the end, the CMDB is the location for the factoids around the services, such as all of the CIs comprising the service, relationships between the CIs, current configuration of the CIs and so on.   If those details are available, why wouldn’t you use them to drive the way in which IT manages the environment?   As things change within the enterprise, the CMDB is updated, and in turn the BSM views should auto-magically update also.
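As a rough sketch of that idea (hypothetical structures, not any specific CMDB schema), the BSM view can be derived from the CI relationships held in the CMDB, so that when discovery updates a CI, the service view follows automatically:

```python
# Sketch: derive a BSM service view from CMDB CI relationships, so that
# updating the CMDB automatically changes what the service view reports.
cmdb = {
    # CI name -> (status, list of CIs it depends on)
    "EMail": ("unknown", ["mta01", "db01"]),
    "CRM":   ("unknown", ["web02", "db01"]),
    "mta01": ("up", []),
    "web02": ("up", []),
    "db01":  ("up", []),
}

def service_health(ci: str) -> str:
    """A service is 'down' if any CI it depends on (directly or transitively) is down."""
    status, depends_on = cmdb[ci]
    if status == "down":
        return "down"
    return "down" if any(service_health(d) == "down" for d in depends_on) else "up"

print(service_health("CRM"))          # 'up'
cmdb["db01"] = ("down", [])           # discovery/monitoring updates the CI...
print(service_health("CRM"))          # ...and the BSM view reflects it: 'down'
print(service_health("EMail"))        # 'down' too, since EMail also depends on db01
```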

Tobin