David Heath
Thursday, 03 September 2009 15:17
Secondly, Telstra has incredibly poor change management procedures. Were this author in charge of the team responsible, there would have been extensive testing plans and fully formulated roll-back procedures. It would appear that the team in charge of the change spent the night committing the update, then all went home to a well-earned sleep without bothering to advise the operational centre how to roll back should things go wrong. How else can we explain the hour taken to fix the issue? Was someone woken up to dial in and repair the damage?
Thirdly, service-level guarantees are generally expressed in uptime percentages. For instance, if you were guaranteed 99% uptime, you could expect up to 3.65 days of outage per year (hopefully scattered across the entire year). More common are guarantees of 99.99% uptime. This equates to around 52 minutes and 34 seconds of downtime per year. Because the outage took about an hour to fix, it exceeded that annual allowance on its own, so anyone with such a guarantee provided by Telstra is now able to invoke non-performance clauses based on this incident alone.
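The arithmetic behind those figures is easy to check. The following is a minimal sketch, assuming a 365-day year and treating the outage as roughly one hour long, as described above; the function names are illustrative only and do not come from any Telstra contract or tool.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a non-leap year


def allowed_downtime_seconds(uptime_percent):
    """Return the downtime (in seconds per year) permitted by an uptime guarantee."""
    return (100 - uptime_percent) / 100 * SECONDS_PER_YEAR


def describe(seconds):
    """Format a duration, rounded to the nearest second, as days/hours/minutes/seconds."""
    total = round(seconds)
    days, rest = divmod(total, 86400)
    hours, rest = divmod(rest, 3600)
    minutes, secs = divmod(rest, 60)
    return f"{days}d {hours}h {minutes}m {secs}s"


for guarantee in (99.0, 99.99):
    print(f"{guarantee}% uptime allows {describe(allowed_downtime_seconds(guarantee))} per year")

# Assumed outage length of about one hour, per the account above.
outage_seconds = 60 * 60
print("Breaches a 99.99% guarantee:", outage_seconds > allowed_downtime_seconds(99.99))
```

Under these assumptions, the 99% guarantee works out to 3 days, 15 hours and 36 minutes (that is, 3.65 days), the 99.99% guarantee to 52 minutes and 34 seconds, and a one-hour outage overshoots the latter by several minutes.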
As the spokesperson noted in the quote shown earlier, "A full investigation is being undertaken. We have commenced a detailed and thorough technical investigation into the incident. This may take some time to conduct to ensure we fully understand the issue and can put appropriate measures in place to maintain the integrity and operation of our network."
Allow this writer to be totally astonished that such measures were not already in place.
Be assured, HEADS WILL ROLL.