Blame storm in the panic room

It’s like a war scene. A crisis at hand, panicked people running around like headless chickens, phones ringing non-stop, a flood of high-priority emails from every branch of the organizational tree, and the blame game of who-did-what played with full enthusiasm. The IT department lives up to its hard-earned title, the ‘panic room’. All this because, for some unknown reason, the enterprise network has gone down.

Being in the IT pit stop can be like walking a tightrope, full of uncertainty and guesswork. Rude surprises and midnight crises are regular incidents in the life of an IT administrator. What is the way out? Do IT teams have to spend their working lives pointing fingers? Isn’t there a better way of handling crisis situations? IT teams mean as much to an organization as SWAT teams mean to a nation. When there is chaos, they should act promptly to restore normalcy. If they play the blame game to perfection instead, whom does it help?

John Boyd’s OODA model (Observe -> Orient -> Decide -> Act), devised for strategic military operations, applies well to this almost war-like scenario. Here is what it means in this context:

1. Observe

When things get out of control, it is critical to look at what is happening for cues that can help solve the crisis. For an IT administrator, logs, reports and the various monitoring systems make it possible to observe the problem in depth and narrow down the root cause. (Check out the different tools from ManageEngine for monitoring network, bandwidth, logs, firewalls, servers, applications, configuration, devices and more.)

2. Orient

Once the root cause is identified, there is usually a plethora of options for solving it. Orienting is the process of listing all the options that can keep the crisis from growing in intensity and impact.

3. Decide

Of the different courses of action available, choosing the one that seems optimum is the ‘Decide’ step. We recommend that you choose a solution that has the maximum positive impact in minimum time.

4. Act

Implementing the chosen solution handles the issue in a progressive, solution-oriented manner. This could mean increasing the bandwidth for an application, blocking access to some services for a limited time, or even redirecting traffic from one server to another.
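The four steps above can be sketched as a small Python loop. This is only an illustration under stated assumptions: the `Option` structure, the keyword-based playbook, the impact scores, the time estimates and the sample log lines are all hypothetical, and scoring options by impact per minute is just one possible reading of "maximum positive impact in minimum time".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    """One candidate response to an incident (hypothetical structure)."""
    name: str
    impact: float   # estimated benefit on an arbitrary 0-10 scale (assumed)
    minutes: float  # estimated time to implement, must be > 0 (assumed)


# Hypothetical playbook: symptom keyword -> possible responses.
PLAYBOOK = {
    "bandwidth": [
        Option("Increase bandwidth for the application", 6, 30),
        Option("Block non-essential services temporarily", 5, 10),
    ],
    "server": [
        Option("Redirect traffic to a standby server", 8, 20),
    ],
}

# Made-up log lines, for illustration only.
LOGS = [
    "INFO backup finished",
    "WARN bandwidth saturation on uplink",
    "ERROR server web-01 not responding",
]


def observe(log_lines):
    """Observe: keep only the log lines that signal trouble."""
    return [line for line in log_lines if "ERROR" in line or "WARN" in line]


def orient(symptoms, playbook):
    """Orient: list every option whose keyword appears in the symptoms."""
    text = " ".join(symptoms).lower()
    return [opt for keyword, opts in playbook.items()
            if keyword in text
            for opt in opts]


def decide(options):
    """Decide: pick the option with the best impact per minute, if any."""
    if not options:
        return None
    return max(options, key=lambda opt: opt.impact / opt.minutes)


def act(option):
    """Act: a real system would trigger the change; here we only report it."""
    return f"Executing: {option.name}"


if __name__ == "__main__":
    symptoms = observe(LOGS)
    options = orient(symptoms, PLAYBOOK)
    choice = decide(options)
    print(act(choice) if choice else "No matching option; escalate.")
```

With these made-up numbers, the quick temporary block wins over the slower but higher-impact redirect. Changing the scoring rule in `decide` changes which response is picked, which is exactly the judgment the Decide step asks for.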

A systematic and meticulous approach, too often missing, is crucial for IT teams handling crises and has to be an integral part of an organization's infrastructure setup. Equipping teams with adequate ammunition, such as IT management tools, helps them work towards restoring normalcy. These tools play an important part in the first step of OODA, and without them the blame game will continue.

References & Further reading:

  1. http://blog.serverfault.com/2012/07/18/ooda-for-sysadmins/
  2. http://en.wikipedia.org/wiki/OODA_loop



About the Author: Shannon Lewis
