Tag Archive | "Availability"

How Complexity Spilled the Oil – Forrester I&O Blog

Tags: Availability, Business Alignment, Business Service Management, Forrester, IT Management, IT Management Tools


The Hub Commentary

A tweet pointed me to this post today, and what a great post and analogy. In fact, I kick off most presentations by stating that Business Service Management is EASY! You hold the key to the most valuable insurance policy in your company. Business runs on technology; it is a commodity, like electricity, and we count on it being there to conduct business. I have a previous post on just that insurance policy, set in hurricane season on the east coast of the United States, where the call center becomes the hub of activity for the power companies. Customers phone in outages, crews are dispatched, and power is restored more quickly with better monitoring of the technology supporting the call center and the dispatching of crews. Technology cannot stop an impending natural disaster such as a hurricane, but it can contain the effects, as described in the linked post.

As with the oil spill that my friend JP references, early warning can help avoid an event or contain it, as was the case with the power outage in the northeastern US a few years ago. The operator’s CIO once told me it took only 8 seconds for that outage to cascade from Ohio to the east coast. Avoiding it at that stage was not possible; containing it becomes the goal. After the event, as JP describes, the operator implemented a monitoring system that correlated data from its grid monitoring system with its technology management tools, giving a complete picture to avoid events by reading the early warning signals and to better contain events when they do occur. An article is posted here describing the integrated approach this electricity operator took, just as JP describes.

I work with companies every day to justify the insurance policy we know of as Business Service Management.  In tough economic times when spending is reduced, justifying a spend becomes difficult when it is not reducing costs directly.  The cost of the approach and tools is far smaller (even when maintained over time) than the disaster of an outage or spill.

Can your company afford a game of high stakes poker when it depends on technology to operate?

Michele

_____________________

The Gulf oil spill of April 2010 was an unprecedented disaster. The National Oil Spill Commission’s report summary shows that this could have been prevented with the use of better technology.  (Read Full Article…)

What is Business Service Management, Really!

Tags: Availability, BSM, Business Alignment, Business Service Management, IT Management, IT Management Tools


A true story, with names withheld to protect the innocent, and a Dilbert in the making: an illustration of Business Service Management through technology impact and the calculation of cost and value, rather than a wiki-like definition.

Early in my career, green and wet behind the ears, I was about 8 months into the job, working the 4:00 to 12:00 shift solo (the shift where stuff gets done, but not discovered until 7:00 am the next day) in a distributed data center. You know what I’m talking about and likely already sense the pain that is about to come. I knew how to run the jobs, but I didn’t know what they were really doing or how to fix things if they went wrong, at least not until one fateful summer night. I was working for an outsourcer processing insurance claims for the customer to pay the beneficiaries.

The Set-up:

I worked my shift and left at midnight: jobs done, reports printed, tape back-ups done. The girl working midnight was about to have an easy night. That is, until about 6:30 am, when she would attempt to bring 3 mainframes online for the next day’s claims processing. Yes, she was greeted with more error codes than she knew what to do with, and The Boss received an early alarm/wake-up call: “I can’t bring Rodney to life, HELP!!” (Rodney is the nickname of each of our mainframes.)

The Solution – Scavenger Hunt:

I arrive at work at 4 to chaos, looks of anger and talk of “irrecoverable damage on your shift yesterday.” I look around, the machines are humming, and I say it was not irrecoverable: Rodney is up and running. The phone rings and I answer it. It is my friend Richard in a distant location. He asks how I am, I say “worst day of my life,” and he says, “it was you!”, meaning he had helped restore service, but no one had ratted me out as the root cause.

Richard walks me through my previous night’s shift and what I did and didn’t notice.  I trashed a bunch of  files.  Not a big deal if you have back-ups, which we did, just hang a tape, reload the files and restart Rodney – 5 minutes.

The Cost – my Penalty:

The Boss comes into the data center and waves at me: come take a walk with me. I figure I’m about to get fired. After all, the data center was down for 7 hours, not a single claim was processed, beneficiaries didn’t receive checks, my company missed an SLA, and dozens of people worked 7 hours to fix my mistake. But there was something even worse I was about to experience. At 20 something, I couldn’t calculate the number of zeros in the cost of my simple error.

The Boss walks me through a room with the folks that input claims and reminds me they get paid by the claim, meaning the number of claims they key each day. My simple mistake caused a 7 hour outage, a team of people had to find the root cause in order to restore service, my company may have been slapped with a fine, and beneficiary checks were delayed. Most heart-wrenching to me, though, was that I impacted the paychecks of more than 100 folks who were paid by the number of claims they keyed each day. As I walked through the room, they didn’t know I was the root cause, but they were glaring at us nonetheless.

The room seemed the length of a football field that day.  As we exited the room The Boss simply said, “are you going to do this again?” and I quickly responded, “I hope you fire me if I do!”.

Business Service Management – claims processing was my business, my company caused an outage of significant cost.  This happens every day, the cost is quite easy to calculate and the insurance policy to mitigate the risk is far less costly, however, as IT professionals we have a difficult time justifying service enabling our data centers with proper management until there is an outage.  A single outage can cost 1-2% of revenue and a solution to avoid it can be a fraction of that cost.
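To show how easy that calculation can be, here is a minimal sketch in Python. The two durations (a 7 hour outage versus a 5 minute restore) come from the story above; the function name, the cost components and every monetary figure are hypothetical placeholders, not numbers from the actual incident.

```python
def outage_cost(hours_down, revenue_per_hour, responders, responder_rate, sla_penalty=0.0):
    """Rough cost of one outage: lost revenue + recovery labor + any SLA penalty.

    All parameters are illustrative assumptions; substitute your own figures.
    """
    lost_revenue = hours_down * revenue_per_hour
    recovery_labor = hours_down * responders * responder_rate
    return lost_revenue + recovery_labor + sla_penalty


# Durations are from the story; the dollar amounts and head counts are made up.
seven_hour_hunt = outage_cost(7, 10_000, 24, 50, sla_penalty=25_000)
five_minute_fix = outage_cost(5 / 60, 10_000, 2, 50)

print(f"7 hour scavenger hunt: ${seven_hour_hunt:,.0f}")
print(f"5 minute restore:      ${five_minute_fix:,.0f}")
```

Even with invented numbers, the shape of the result is the point: the gap between the two totals is the budget available for the insurance policy.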

Data centers are growing more complex. Virtualization and cloud computing are seen as low-cost options because they remove hardware and software costs; however, the cost of support is overlooked, and we are entering a familiar cycle of short-sighted savings over long-term cost, repeating the dotcom bust of the early 2000s with the hosting providers and web services. Service enabling infrastructure has to be factored into the solution: an end-to-end view to pinpoint root cause, and the visibility to read the indicators before impact, so that restoration takes minutes, not hours, greatly reducing the cost of an outage. Service enabling with management up front allows you to take risks and be agile with new technology, because the right management is in place to monitor for thresholds, errors, etc., avoiding and mitigating outages.

I know my Boss wasn’t really mad that I was the root cause of an outage; he was mad that a 5 minute fix relied upon a 7 hour scavenger hunt! This is my Dilbert. What’s yours?

Michele

Response Time Testing is not enough

Tags: Availability, BSM, Business Service Management, IT Management, Performance, Response Time, Service Level, Service Management, SLM


Setting up a tool that performs some type of end user performance testing is not enough. That testing provides a view of the end user experience of one part of a specific service. Adding Service Level Management on top of the testing is still not enough.

Business Service Management is a bit more encompassing. When there are slow response times, which piece of the supporting technology is the culprit? Is this something that can (or needs to) be addressed now? If we were to take this slow database offline in order to address the issue, what impact would that have on the enterprise or end users? Business Service Management helps with these questions and more. End user response time measurement is just a piece of BSM; it might be a good starting point, but don’t be fooled, you are not done.

Remember there are several layers in the OSI model, and having a health indicator from each of those layers (or at least several) is going to provide a better end-to-end picture of the health of the service. The big management tool vendors typically compete against each other, and the typical model is rip and replace: they sell you new tools and get you to stop using the old tools, a very expensive and disruptive proposition. Since no single vendor is best of breed for each of the OSI layers, a single vendor for end-to-end management doesn’t make sense. It makes more sense to purchase some tools to do specific types of monitoring, leverage open source to monitor some of the other aspects, and then roll all of these together into a single end-to-end view. This approach lets you swap out tools when they become dated or when the vendor is trying to hold you hostage at renewal time.

Having a single console that integrates with all of the underlying technologies managing the environment and provides an end-to-end view is a better way to manage the enterprise. Using a response time tool and crossing your fingers that everything will work out is risky.
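As a rough illustration of rolling several tools into one end-to-end view, the Python sketch below combines per-layer health indicators into a single service state and names the layer to look at first. The "worst state wins" rule, the `Health` levels and the indicator names are all assumptions made for the example; real BSM consoles typically support richer rules.

```python
from enum import IntEnum


class Health(IntEnum):
    OK = 0
    WARNING = 1
    CRITICAL = 2


def service_health(indicators):
    """Roll per-layer indicators up to one state: the worst state wins."""
    return max(indicators.values(), default=Health.OK)


def culprits(indicators):
    """Name the layers responsible for the rolled-up state (empty if all OK)."""
    worst = service_health(indicators)
    if worst == Health.OK:
        return []
    return sorted(name for name, state in indicators.items() if state == worst)


# Hypothetical readings, each fed by a different monitoring tool.
order_entry = {
    "network": Health.OK,
    "database": Health.CRITICAL,
    "end_user_response_time": Health.WARNING,
}

print(service_health(order_entry).name, culprits(order_entry))
```

Because each indicator is just a named entry, the tool behind any one of them can be swapped out without changing the roll-up.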

Tobin

Where Network and Systems Management is Headed Next – NetworkWorld

Tags: Availability, Business Service Management, Change, IT Management, NetworkWorld, Open Source, Performance, Predictions, Service Providers


The Hub Commentary

I tend to agree with the increased focus on performance and end-user monitoring and believe it will be driven by the requirement to monitor the service providers as cloud services are incorporated.  Controlled change management with an end-to-end view of the complex infrastructure will work to mitigate risk and both will rely on an integration platform and strategy.

Open Source monitoring for the commodity will also rise in popularity and implementation, but I don’t agree with building more into the monitoring tools or waiting for one vendor to build a single framework. A single vendor framework is not, and never has been, built from the ground up to monitor and manage the data center. The big four all grow through acquisition of many technologies and cobble them together through data layer integrations. No offense to my vendor friends: I’ve been a product manager for one of them and did exactly the same thing after an acquisition was made. It’s the quickest way to claim integration victory.

I believe we will see data centers leveraging Open Source; Cloud services for both Services (commodity management, à la ITSM tools) and Infrastructure (to test out excess capacity and demand flexibility options); a rise in both availability and performance monitoring for new technology and delivery methods; and end-to-end visibility requirements to mitigate risk and speed restoration time. All of this is addressed, and the data center future-proofed, by a sound integration platform and strategy that pulls the physical, virtual and cloud environments together into a single view for monitoring, managing and measuring.

Push your suppliers to build the best monitoring/management tools possible and leverage an integration platform to bring the best of your investments together to transform your data center into a service provider. Oh yeah, 2011 will be the tipping point for data center transformation into a service provider, not a technology manager.

Michele

______________________

Depending on where they stand in the overall environment, network and systems management companies hear different concerns from their enterprise IT clientele. Here’s a look at how the year will shake out in a number of different areas … (read full article…)

IT in 2011: Four Trends that Will Change Priorities – CIO

Tags: Availability, Business Alignment, Business Service Management, CIO, Cloud, Performance, Trends


What does the post-recession IT world look like? More media will drive the need for more bandwidth, and a demand for Windows 7 upgrades and corporate use of personal smartphones will shape new priorities for IT.  (read more…)

The Hub Commentary

“Think like a start-up” sums up what I was thinking as I read this article. There are good long- and short-sighted views in the IT news these days. This is a great thing: there is activity again, and it feels like budgets are loosening in the right spirit.

Here’s what I mean by short- and long-sighted. Cost savings by reducing infrastructure: To the Cloud! Hidden costs: to monitor, manage, support, secure and protect. Rarely is it cheaper to outsource unless you are a hideously inefficient organization. However, right-sourcing is the right approach. Another example is all the new technologies mentioned in the article. I’m sure there is an expectation of performance, availability and information accessibility across many platforms, and by the way, we are starting with mobile now. Again, there is that pesky back-end monitoring, managing, supporting, securing, protecting and measuring.

The consumer market drives business requirements and thus IT. The introduction of every new technology to the consumer market should immediately be thought of as entering the enterprise, and thus be evaluated for its application and potential value-add or not. Business is still ahead, and IT is still out of sync, reacting.

2011 is going to be a tipping point of a year for alignment of IT to become a service enabling organization with agility.  The IT manager that begins thinking like a start-up to meet the requirements, embracing new technologies and building management in from the get go will be the winner in the long run.

Is your IT an Operating Commodity or Contributing Necessity?

Michele

7 Things You Need to Build a Cloud Infrastructure – PCWorld

Tags: Availability, Best Practices, Business Service Management, Cloud, IT Management Tools, ITSM, PCWorld, Service Level


Today, service providers and enterprises interested in implementing clouds face the challenge of integrating complex software and hardware components from multiple vendors. The resulting system can end up being expensive to build and hard to operate, minimizing the original motives and benefits of moving to cloud computing.  (read more…)

BSM Stories from the Trenches-Hurricanes, Availability & Power On!

Tags: Availability, Business Service Management, Monitoring


Tale of Customer Service, Cost of Service Impact, Speed to Restore and the “Charley” View!

As we are in the heart of Hurricane season, I’m reminded of the old “Charley” Business Service View – a Category 4 Hurricane in 2004. This is a true story about a power company and how IT is impacted and how IT, by being proactive and hurricane prepared, can be the business driver in containing and managing the impending events that a hurricane brings with the loss of power due to downed lines. This is the second in a series of Business Service Management (BSM) Stories from the Trenches that I’ll post describing the benefits of a single-pane-of-glass and the management of complex infrastructures as services to the business and the customer driving customer satisfaction, revenue and growth.

The Set-up . . . . .

Just 22 hours after Tropical Storm Bonnie hit, Charley struck as a Category 4 Hurricane, making it the first time in history that 2 tropical cyclones struck the same state in a 24 hour period. While the power companies have a bit of time to prepare for storms, the 2004 season was a tough one to manage. So what is relevant to the power companies, and what does IT have to do with the delivery of electricity?

* >3 million power consumers in the region
* Customer service applications become key
* Responding to customer complaints and logging them
* Using the customer calls to identify all outages
* “Keeping the Lights On” in the customer response center becomes key
* Dispatch systems are operationally key
* Returning service to customers in a time of need relies on IT doing more than just “Keeping the Lights On”

The Solution . . . . .

We need to know what to work on first in a “sea of red”, and we know that due to power issues we will be inundated with network and systems management events. Managing those key customer and operational systems will take on a higher priority than anything else. This requires 4 things:

* Identify key customer and operational services in adverse situations
* Filtering of the intelligent service model to focus on newly prioritized services
* “Live”, single-pane-of-glass prioritizing events automatically
* Proactive service view to manage key systems averting downtime

We have monitoring in place, but we do not have a way to pull it together and marry it in a meaningful way to the infrastructure and we need to create a Service View of the infrastructure. This will require a lot of integration to meet the “live” requirement so that we can take action in real time and avert service impacting events, to filter the view to changing conditions and prioritize events.

This is where Novell’s Business Service Management came in to integrate, build the intelligent service models, automate the filtering of services, prioritization of events and the “Charley” dashboard that drove the delivery of high quality service of key customer and operational services.
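To make the idea of filtering and prioritizing a “sea of red” concrete, here is a minimal Python sketch. It is not how the Novell product is implemented; the `Event` fields, the service names and the storm-mode priority map are hypothetical, chosen only to mirror the requirements listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    source: str    # monitoring tool that raised the event
    service: str   # business service the affected component supports
    severity: int  # higher means more severe


# Hypothetical "storm mode" service model: lower number = work on it first.
STORM_PRIORITIES = {"customer_call_center": 1, "crew_dispatch": 2}


def prioritize(events, priorities):
    """Order events by service priority, then by severity (most severe first)."""
    last = float("inf")  # services missing from the map sort to the end
    return sorted(events, key=lambda e: (priorities.get(e.service, last), -e.severity))


def key_services_only(events, priorities):
    """Filter the prioritized view down to the services that matter right now."""
    return [e for e in prioritize(events, priorities) if e.service in priorities]


events = [
    Event("network_monitor", "intranet_wiki", 5),
    Event("systems_monitor", "crew_dispatch", 3),
    Event("network_monitor", "customer_call_center", 2),
    Event("systems_monitor", "customer_call_center", 4),
]

for event in key_services_only(events, STORM_PRIORITIES):
    print(event.service, event.severity, event.source)
```

Swapping in a different priority map when conditions change is what lets the same stream of events produce a different view, which is the filtering behavior the requirements call for.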

The Benefits . . . . .

* >500,000 customers lost power
* >6,000 crews were dispatched successfully
* <1 week, 98% power restoration
* Exceeded the committed goal of 10 days for power restoration

IT does not restore the power, and IT is impacted by the loss of power, but IT is critical in delivering the services that enable those who restore the power, by connecting customers and line crews. The Intelligent Service Model and the automation of filtering services and prioritizing events ensure that the systems that aid in power restoration are available. This is not an uncommon story for customers of the Novell Business Service Management solution. The heart of the solution is the “live” integration and the intelligent service model, which make sense of and relate bits of disparate data as super objects with rules describing conditions and state, enabling operational teams to service align, mitigate risk and deliver mission critical services with consistent high quality, driving the business.

I often run into IT folks that do not believe IT is critical to the organization. I find that is because they have not invested in understanding and aligning to the business objectives. Electricity is a commodity; however, IT is key to driving revenue by getting power flowing again and restoring service in a time of need and adversity. Ask yourself what your company delivers to the public and how IT impacts the services that support that revenue, and you have your answer, as this power company did in determining where to focus and how that focus changes with conditions. Are you agile enough to monitor, manage and measure to changes in real time?

When Mission Critical Services is All About — “Keeping the Lights On”!

Check out this article of yet another energy company, again containing & mitigating risk: An Integrated Utility Network

Accidental Cloud Ldr–Stealth Cloud Followers–Which Cloud are you On?

Tags: Availability, Business Service Management, Cloud, Performance, Service Level


Are you leading your organization’s cloud roll-out or are you reacting to it? It is happening, better to lead than follow!

The WorkloadIQ post, and the CIO article on the Stealth Cloud that Richard references, remind me of a previous NetworkWorld article about the Accidental Cloud Leader. Both of these articles make the same point: the cloud is coming, and the choice facing IT organizations is whether to lead (control costs, mitigate risk and deliver quality service) or to follow (with rising costs, reactive IT, high risk and poor service quality). Richard hits the nail on the head: IT is traditionally change averse and insecure with the concept of outsourcing services. Technology is evolving faster and faster, and the very organization that should adopt, deploy and lead with technology continues to lag.

In almost all cases, sourcing decisions are made to create change that an organization has difficulty bringing about itself, not for cost reasons. Commodity functions are best suited for outsourcing, driving standards and managing costs. However, outsourcing the service does not remove accountability for managing service delivery.

Cloud providers are popping up faster than service providers did during the dotcom boom days of web hosting, application hosting, etc. There are several key factors to consider, as pointed out in these articles and blog posts:

* Availability of service
* Risk of a secure service
* Reliability of the service provider
* Cost of support

Availability of Service and Reliability of the service provider

The dotcom bust of service providers in the early 2000s came down to a lack of mature management processes. Many providers today are one significant outage away from being out of business. Is this who you trust your services to? Who’s managing and leading this due diligence in contracting for the services in the leader / follower scenario?

When seeking service providers, it is important to understand their management processes and capabilities. You do not want to define them, but a lack of management transparency and process tells you about the maturity of the service provider and its ability to deliver availability of services. One thing to note here is not to ask for inappropriate service levels and/or penalties. Investigate their typical services, leverage the cloud and service providers for the commodity and take advantage of the economies of scale they offer.

Risk of a secure service

Security as an obstacle to going to the cloud or leveraging an as-a-service provider is, quite frankly, IT noise. As described in these articles and blogs, this is the service provider’s business, and they know it is the number one objection they face. In many cases, they may offer a far more secure environment than most IT organizations, and thus the rise of IT insecurity and noise. However, again, it is an area that must be investigated as it relates to the mature management practices of a service provider.

Cost of support

Organizations express frustration with their IT organizations as a perceived obstacle to agility and innovation, and so they go to the cloud directly. As Richard’s blog points out, this costs your IT organization more in the long run to support: the service will go down, the business will call support for help, the provider may well not be reliable and, in the worst case, data and security can be breached.

Management generally lags new technology, and going to the service providers directly for a defined service at a defined cost is more appealing to the business. Management lags both within IT internally and with the service providers, compounding the risk of an outage or security breach.

Novell Operations Center (a WorkloadIQ solution) provides the ability to monitor, manage and measure technology services internally as well as the performance and availability of the service provider, ensuring quality service delivery. Service enabling your infrastructure could not be easier today and would provide the control with agility your organization is screaming for from your IT organization. Management does not have to be an afterthought and the right platform can future-proof your services with technology adoption agility!

Check out these articles and then answer: Are you following or leading your organization’s cloud rollout, because it is happening and coming . . . Are you Stealth or Leading? What are your challenges and concerns?

Change Mgmt Tools Can Prevent Application Outages – NetworkWorld Podcast

Tags: Availability, Business Service Management, Change, NetworkWorld, Podcast


Gartner says 80% of downtime is due to human error and problems created by process, such as inadequate testing and unauthorized changes. And research from Managed Objects, a provider of business service management tools, shows many of these failures come from custom or home-grown applications. Network World’s Jason Meserve talks about the issue of application outages and what can be done to prevent them with Michele Hudnall (12:34)  (Listen to this Podcast…)

An Integrated Utility Network – Electric Energy Online.com

Tags: Availability, Business Service Management, Electric Energy Online.com, Performance, Service Level


The business case for business service management (BSM) at Ontario’s Independent Electricity System Operator, commonly referred to as IESO, started out as a proposal for solving a traditional IT management problem. Yet in the process of defining the problem and evaluating solutions, the IESO discovered a way to simultaneously enlist the endorsement of business users by incorporating the supervision, control and management of the power grid and its energy market systems into the IT project.  (Read Full Article…)