We on The Hub do not usually publish tool-specific posts, but today I’m going to point you to a great best-practices post by Tobin. Yes, he does mention a product, but the post is more about setting up High Availability systems and what that means.
Oftentimes when we speak with customers about Service Enabling their infrastructure and end-to-end visibility of services, we enter into a discussion of “nice to have” versus “must have”. So why is management a “must have” today, and why is high availability relevant to management tools? I always answer, “Do you want the pilot flying the plane blind, without instruments? Why would you run your business blind?”
IDC forecasts that by 2013 more than 50% of IT budgets will be dedicated to outsourced IT. While most IT organizations are providing high availability of services, it’s not based upon priority. 90% of IT organizations are still at level 2 of maturity, reacting to events rather than managing proactively. This tells us that all technology is being managed equally, while the business is shouting for management by business priority and, as a result, becoming more involved in buying decisions and deployment options.
Long story short, management and visibility are a must-have for becoming proactive and for removing the obstacles to embracing new technologies and services that drive the business. This requires service enabling your infrastructure for visibility. That management layer must then be treated as mission critical, and it requires a high availability infrastructure of its own to continue to deliver high-quality services.
Do you treat your management systems as high availability systems? Why not?
Configuring Operations Center (and other products) in a High Availability (HA) configuration tends to confuse people. I guess it starts with the basic requirement: an application or service needs to be configured so that it has some level of Fault Tolerance (FT) and/or HA, which in turn reduces the possibility of outages (the system not being available to end users). FT helps with HA, but FT is not HA (more on that later).
Fault Tolerance (FT) is more about configuring the hardware of a system so that in the event of a failure (e.g., a hard drive), the system automatically recovers and continues to operate without intervention. The most common example involves hard drives and leveraging “RAID” (Redundant Array of Independent Disks). Other solutions provide dual power supplies, NICs, etc.
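To make the fault tolerance idea a little more concrete, here is a minimal sketch in Python of a mirrored disk pair, loosely modeled on RAID 1. The `Disk` and `MirroredPair` classes are hypothetical and exist only for illustration; real RAID is handled by a controller or the operating system, not by application code like this. The point is simply that when one component fails, reads keep working without anyone stepping in.

```python
class Disk:
    """A toy disk (hypothetical): block number -> bytes, plus a failure flag."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block, data):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        self.blocks[block] = data

    def read(self, block):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        return self.blocks[block]


class MirroredPair:
    """Toy RAID 1-style mirror: every write goes to all healthy disks."""

    def __init__(self, disks):
        self.disks = list(disks)

    def write(self, block, data):
        healthy = [d for d in self.disks if not d.failed]
        if not healthy:
            raise IOError("all disks have failed")
        for d in healthy:
            d.write(block, data)

    def read(self, block):
        # Serve the read from the first disk that is still healthy.
        for d in self.disks:
            if not d.failed:
                return d.read(block)
        raise IOError("all disks have failed")


def demo():
    a, b = Disk("disk-a"), Disk("disk-b")
    mirror = MirroredPair([a, b])
    mirror.write(0, b"service config")
    a.failed = True  # simulate a hard drive failure
    return mirror.read(0)  # still served, from disk-b
```

Calling `demo()` still returns the data after `disk-a` fails, because the second copy on `disk-b` takes over. Note that this only protects against the loss of one component inside a single system. If the whole server goes down, the mirror goes with it, which is one way to see why FT helps with HA but is not HA.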