The recent Microsoft/CrowdStrike global outage, triggered by a seemingly innocuous cybersecurity update, serves as a reminder of the critical, and often underestimated, role that Third-Party Risk Management (TPRM) plays in ensuring business continuity.
In the early hours of July 19th UTC (the evening of July 18th in the Americas), a “defect” in a content update from CrowdStrike, a leading provider of cybersecurity for businesses worldwide, set off a cascading failure that became one of the most widespread IT outages ever seen. The update, intended for the Falcon sensor (CrowdStrike’s endpoint protection agent) on Windows machines, caused a Blue Screen of Death (BSOD) on affected systems, forcing them offline and into a reboot loop. The outage disrupted critical operations across various industries, including financial services, healthcare, and transportation.
The CrowdStrike outage triggered a global domino effect, crippling critical services such as emergency response systems, flights, banking, and even access to medical care. This incident, coupled with Microsoft’s Office 365 issues and a summer travel surge, caused significant productivity losses for businesses, highlighting the far-reaching consequences of an unforeseen incident. Although CrowdStrike fixed the update, recovery for affected machines requires manual intervention, potentially leaving security gaps and causing delays.
The Challenges
Cybersecurity companies like CrowdStrike operate in a fast-paced environment. New threats emerge constantly, demanding rapid updates to security solutions. However, this relentless pressure can lead to a trade-off between speed and thoroughness. Here’s where potential problems arise:
- Limited Change Management – A robust change management process ensures updates are rigorously tested before deployment. In the rush to address new threats, such processes might be bypassed, leading to unforeseen issues.
- Widespread Rollouts – Rolling out updates to all users simultaneously magnifies the impact of potential problems. A phased rollout, targeting specific regions or user groups first, can minimize disruption in case of unforeseen bugs.
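As a rough illustration of the phased rollout idea above, the sketch below deploys an update ring by ring and stops as soon as one ring's failure rate crosses a threshold. It is a minimal sketch under stated assumptions, not a description of how CrowdStrike or any other vendor ships updates: the `Ring` type, the `deploy` callback, and the `max_failure_rate` parameter are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Ring:
    """A hypothetical rollout group, e.g. canary hosts or one region."""
    name: str
    hosts: Sequence[str]


def phased_rollout(
    rings: Sequence[Ring],
    deploy: Callable[[str], bool],
    max_failure_rate: float = 0.01,
) -> List[str]:
    """Deploy to each ring in order and return the names of rings that passed.

    `deploy` is an assumed callback that applies the update to one host and
    returns True if the host is healthy afterwards. The rollout halts at the
    first ring whose failure rate exceeds `max_failure_rate`, so later
    (usually larger) rings never receive the faulty update.
    """
    completed: List[str] = []
    for ring in rings:
        # Update every host in this ring and count the unhealthy ones.
        failures = sum(1 for host in ring.hosts if not deploy(host))
        rate = failures / len(ring.hosts) if ring.hosts else 0.0
        if rate > max_failure_rate:
            print(f"Halting rollout: ring '{ring.name}' failure rate {rate:.0%}")
            break
        completed.append(ring.name)
    return completed
```

Ordering the rings from smallest to largest means a defect is most likely to surface while only a handful of machines are exposed, which is the point of the phased approach.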
The Lessons
The CrowdStrike and Windows outage exposes a critical vulnerability in our hyper-connected world: the lack of robust Third-Party Risk Management and business continuity planning to protect against disruptions. Here are some important lessons we can learn from the global outage:
Moving Beyond Third-Party Risk
This incident highlights the need to go beyond simply evaluating a vendor’s product or service. A thorough TPRM approach should delve into a vendor’s security posture, their track record with past incidents, and their commitment to ongoing risk management.
Organizations need to extend their risk management beyond immediate third-party vendors and consider potential vulnerabilities across their entire digital supply chain. That includes “fourth-party” networks (vendors to their vendors) as well as potential weaknesses in the infrastructure and security practices of other interconnected businesses and partners.
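One way to make fourth-party exposure visible is to record who depends on whom and walk that map outward from your own organization. The sketch below is a minimal, assumption-laden example: the `graph` dictionary, the vendor names in the usage note, and the `dependency_tiers` function are all hypothetical, and a real inventory would come from vendor questionnaires or a TPRM platform rather than a hand-written dictionary.

```python
from collections import deque
from typing import Dict, Iterable, Mapping


def dependency_tiers(
    graph: Mapping[str, Iterable[str]], root: str
) -> Dict[str, int]:
    """Return {vendor: tier} for every vendor reachable from `root`.

    Tier 1 is a direct (third-party) vendor, tier 2 is a fourth party
    (a vendor of a vendor), and so on. Uses breadth-first search, so each
    vendor is reported at its shortest distance from `root`.
    """
    tiers: Dict[str, int] = {}
    seen = {root}
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        for dep in graph.get(node, ()):
            if dep not in seen:
                seen.add(dep)
                tiers[dep] = depth + 1
                queue.append((dep, depth + 1))
    return tiers
```

For example, with `graph = {"us": ["EDR vendor", "Payroll SaaS"], "EDR vendor": ["Cloud host"], "Payroll SaaS": ["Cloud host"]}`, calling `dependency_tiers(graph, "us")` places "Cloud host" at tier 2, which flags a shared fourth party that two direct vendors rely on.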
Understanding Your Contracts
Many third-party vendor contracts contain limitations on liability (indemnity clauses, insurance, warranties). These clauses can significantly impact how much an organization can recover in the event of a disruption caused by the vendor’s negligence. Organizations need to be aware of these limitations and negotiate stronger terms where necessary.
This may involve seeking legal counsel to ensure contract language is clear, concise, and adequately protects the organization’s interests. Additionally, organizations should develop a standardized approach to third-party risk management contracts, ensuring all essential clauses are included and potential loopholes are addressed.
Proactive Monitoring
Continuous monitoring of third-party systems is crucial for identifying vulnerabilities before they can be exploited. This can be significantly enhanced by leveraging continuous monitoring TPRM solutions. These solutions act as an extension of your security team, providing real-time insights into the security posture of your third-party ecosystem. By automating vulnerability scanning and integrating with threat intelligence feeds, TPRM solutions can identify potential risks much faster than traditional methods.
Additionally, these platforms often offer streamlined communication modules, facilitating the exchange of critical information with vendors regarding security incidents and patch deployments. This holistic approach to continuous monitoring empowers organizations to proactively manage third-party risk and ensure the resilience of their digital supply chain.
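To give a feel for what automated matching against a threat intelligence feed might look like, here is a small sketch that flags vendors whose products appear in recent advisories at or above a chosen severity. Everything in it is an assumption made for illustration: the `Advisory` record, the `inventory` mapping, the severity labels, and the `flag_vendors` function do not correspond to any particular TPRM product or feed format.

```python
from dataclasses import dataclass
from typing import Dict, List, Mapping, Sequence, Set

# Assumed severity scale; real feeds may use numeric scores instead.
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}


@dataclass(frozen=True)
class Advisory:
    """A hypothetical, simplified threat-intelligence entry."""
    product: str
    severity: str
    summary: str


def flag_vendors(
    inventory: Mapping[str, Set[str]],
    advisories: Sequence[Advisory],
    min_severity: str = "high",
) -> Dict[str, List[Advisory]]:
    """Map each vendor to the advisories that affect products it supplies.

    `inventory` maps a vendor name to the set of its products in use.
    Only advisories at or above `min_severity` are reported, and vendors
    with no matching advisories are left out of the result.
    """
    threshold = SEVERITY_ORDER[min_severity]
    flagged: Dict[str, List[Advisory]] = {}
    for vendor, products in inventory.items():
        hits = [
            adv
            for adv in advisories
            if adv.product in products
            and SEVERITY_ORDER[adv.severity] >= threshold
        ]
        if hits:
            flagged[vendor] = hits
    return flagged
```

Running a check like this on a schedule, rather than once a year during vendor review, is what turns a periodic assessment into continuous monitoring.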
Preparedness for Disruptions
Even with strong TPRM, unforeseen events can still occur. A well-defined business continuity plan (BCP) ensures the organization can respond effectively, minimizing downtime and impact on operations. This includes procedures for recovery, alternative solutions to maintain critical functions, and a communication plan to keep stakeholders informed.
Businesses often dismiss major disruptions like the CrowdStrike outage as “too unlikely” to plan for. But recent events show even complex, multi-layered outages are possible. Instead of focusing on “if” an event will occur, shift your perspective to “when” and develop contingency plans to minimize impact. Business continuity plans are not static documents. Regular testing and refinement are crucial to ensure they remain effective in the face of evolving threats.
The CrowdStrike outage may have caused disruption, but it also presents an opportunity for a paradigm shift. Companies must now move beyond reactive measures and embrace a proactive approach to TPRM. This requires a multi-pronged strategy: in-depth vendor assessments, continuous monitoring of third-party systems, and a clear understanding of fourth-party dependencies.
Effective communication and collaboration across IT, procurement, and risk management teams are essential. Additionally, a strong business continuity plan ensures critical operations can continue even during disruptions. By prioritizing proactive TPRM, fostering collaboration, and building resilience, companies can emerge stronger and more prepared for the future.
Transform Risk Management from Reactive to Proactive. Don’t settle for reactive measures. Enlighta can help you transform your approach to TPRM and build a proactive program that identifies and addresses risks before they disrupt your business.
Get in touch with us at info@enlighta.com