Why There Was A Global IT Outage
A massive IT outage nearly brought the world to its knees.
It wrought chaos in airports, supermarkets, banks, healthcare services, media and a string of businesses globally. Dubbed “one of the largest IT outages in history” by the Financial Times, the disruption was traced to Microsoft Windows machines that had received a faulty software update. The update was attributed to US cybersecurity firm CrowdStrike, and the failure manifested as the “blue screen of death,” impacting Windows 10 users.
Something super weird happening right now: just been called by several totally different media outlets in the last few minutes, all with Windows machines suddenly BSoD’ing (Blue Screen of Death). Anyone else seen this? Seems to be entering recovery mode: pic.twitter.com/DxdLyA9BLA
— Troy Hunt (@troyhunt) July 19, 2024
Microsoft introduced the BSoD in Windows 3.0 back in 1990. Also called a blue screen error, a fatal error or a bug check, and officially known as a stop error, it “is a critical error screen displayed by the Microsoft Windows and ReactOS operating systems. It indicates a system crash, in which the operating system reaches a critical condition where it can no longer operate safely,” states Wikipedia, adding that “Possible issues that can cause a BSoD include hardware failures, an issue with or without a device driver, or unexpected termination of a crucial process or thread.” When a serious error stops Windows from working, a blue screen appears detailing what happened, driving users crazy as it typically shows up at the worst possible moment.
In this instance, essential services from the tech giant, such as Microsoft Teams, Windows 365, and OneDrive, were affected across India, Australia, the US and the UK, with the effects felt locally by some CIOs. Aussies were the first to raise the alarm. The Verge reported that “The issues spread fast as businesses based in Europe started their workday. UK broadcaster Sky News was unable to broadcast its morning news bulletins for hours this morning, and was showing a message apologizing for ‘the interruption to this broadcast.’” Ryanair, one of Europe’s biggest airlines, declared it a “third-party” IT issue.
According to Reuters, CrowdStrike’s “Falcon Sensor” software caused Microsoft Windows to crash and display a blue screen. The agency added that “The travel industry was among the hardest hit with airports around the world reporting delays and issues with their system network, while banks and financial institutions from Australia and India to South Africa warned clients about disruptions to their services.”
CrowdStrike is actively working with customers impacted by a defect found in a single content update for Windows hosts. Mac and Linux hosts are not impacted. This is not a security incident or cyberattack. The issue has been identified, isolated and a fix has been deployed. We…
— George Kurtz (@George_Kurtz) July 19, 2024
CrowdStrike eventually identified the issue and deployed a fix, following an explanatory tweet by George Kurtz, the CEO. The root cause, it appears, was an update to the kernel-level driver that CrowdStrike uses to secure Windows machines. While CrowdStrike reverted the faulty update after “widespread reports of BSODs on Windows hosts,” the rollback does not appear to help machines that had already been impacted. CrowdStrike, reports the FT, is “one of the largest providers of ‘endpoint’ security software,” which protects the remote devices connected to corporate networks, from laptops, phones and servers to retail payment terminals and cash machines.
Meanwhile, the BBC pointed out that Microsoft had suggested a solution we are all no doubt familiar with: “Have you tried switching it off and on again?” It may have worked for some PCs. “Several reboots (as many as 15 have been reported) may be required, but overall feedback is that reboots are an effective troubleshooting step at this stage.”
Microsoft also said on X that it was “taking mitigation actions” after service issues. “Our services are still seeing continuous improvements while we continue to take mitigation actions,” it said, adding that a number of its products had been restored, while Microsoft Teams users may be unable to access group chats. Microsoft Purview, Microsoft 365 admin centre, Microsoft Fabric, and PowerBI are, as of publishing, still affected. Customers of Microsoft’s Azure cloud computing platform, much of which runs on Windows, have also reported problems. But if you’re still affected, here’s a little help.
“The underlying cause has been fixed, however, residual impact is continuing to affect some Microsoft 365 apps and services. We’re conducting additional mitigations to provide relief. More details can be found within the admin centre under MO821132 and on https://status.cloud.microsoft.”