What initially looked like a problem with Microsoft’s Azure service was actually caused by a faulty update released by a single software provider, CrowdStrike. The update quickly spread across the world via Azure’s global networks, affecting an estimated 8.5 million Windows machines. Once the source was identified, the situation was resolved, though not before businesses and the general public had been affected.
Another outage occurred on July 30, 2024, hitting businesses around the world without warning. It was less severe than the CrowdStrike incident in its cause, impact, and implications. Even so, it highlighted the reliance on cloud services that may not always be reliable, and it prompted a closer look at the nature of these outages.
The media often uses different terms to describe these incidents: a loss of confidentiality is called a breach, while a loss of integrity or availability is called an outage. Understanding the difference between these terms is crucial to learning the right lessons from each incident.
The recent outages point to growing concern about reliance on cloud services, particularly for critical services such as emergency response systems in the UK. The decision to use Microsoft’s cloud services for critical and public safety applications may have exposed the UK to unforeseen risks, as these services were not designed for high-risk use scenarios.
It is imperative for government officials to address these issues and hold responsible parties accountable for any breaches or outages that occur. Lessons must be learned from these incidents to ensure the safety and reliability of critical services for the public.