<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Cascading-Failure on wid's blog</title><link>https://wid-blog.github.io/en/tags/cascading-failure/</link><description>Recent content in Cascading-Failure on wid's blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Thu, 13 Feb 2025 00:00:00 +0000</lastBuildDate><atom:link href="https://wid-blog.github.io/en/tags/cascading-failure/index.xml" rel="self" type="application/rss+xml"/><item><title>Ad System Outage Retrospective — A Shared Dependency and a Single Point of Failure</title><link>https://wid-blog.github.io/en/posts/career/dable/cascading-failure-retrospective/</link><pubDate>Thu, 13 Feb 2025 00:00:00 +0000</pubDate><guid>https://wid-blog.github.io/en/posts/career/dable/cascading-failure-retrospective/</guid><description>An external event drove ad traffic far above normal, triggering a cascading failure. The real problem was that the filtering component was a single point of failure — and the fallback sat on top of it too, so one collapse pulled both down at once. The fix took three paths: removing the fallback's dependency (independence), adding rate limiting to the component itself (protection), and reconsidering the runtime (throughput).</description></item></channel></rss>