Tuesday, 18 September 2018 10:24

Microsoft releases some details about 4 Sept Azure issues

Microsoft has released an overview of the problems suffered by its Azure cloud platform beginning on 4 September, saying that the investigation would continue and a detailed analysis would be released later.

The company said on its Azure status page that high-energy storms had hit southern Texas on the morning of 4 September, close to Microsoft Azure’s South Central US region, causing voltage to fluctuate both up and down.

These fluctuations caused the data centre's cooling systems to shut down, damaging hardware that then had to be replaced. Microsoft decided to attempt data recovery rather than fail over to another data centre, a choice that had a cascading impact on services outside the region.

In the South Central US region, storage servers began to shut down from about 2.30am Pacific Time on 4 September (7.30pm AEST 4 September). A large number of services were affected; although the vast majority of the effects were mitigated by 4am Pacific Time (9pm AEST 4 September), full mitigation did not take effect until 1.40am Pacific Time on 7 September (6.40pm AEST 7 September).

Other services affected were the Azure Service Manager, Azure Active Directory, Visual Studio Team Services, Azure Application Insights, the Azure status page and Azure subscription management.

Microsoft apologised to those affected and said it would undertake the following work, covering what it deemed the biggest contributing factors to the incident:

  • "A detailed forensic analysis of the impacted data centre hardware and systems, in addition to a thorough review of the data centre recovery procedures.
  • "A review with every internal service to identify dependencies on the Azure Service Manager API. We are exploring migration options to move these services from ASM to the newer ARM architecture.
  • "An evaluation of the future hardware design of storage scale units to increase resiliency to environmental factors. In addition, for scenarios in which impact is unavoidable, we are determining software changes to automate and accelerate recovery."

The Azure service was hit by a fire suppression incident last year. It also suffered prolonged downtime in 2013.

Sam Varghese

Sam Varghese has been writing for iTWire since 2006, a year after the site came into existence. For nearly a decade thereafter, he wrote mostly about free and open source software, based on his own use of this genre of software. Since May 2016, he has been writing across many areas of technology. He has been a journalist for nearly 40 years in India (Indian Express and Deccan Herald), the UAE (Khaleej Times) and Australia (Daily Commercial News (now defunct) and The Age). His personal blog is titled Irregular Expression.


