Episodes

  • The Blast Radius of Agentic AI: Why "Five Nines" is a Relic
    2026/05/01
    This episode explores why the traditional "five nines" reliability metric is fundamentally unsuitable for agentic AI systems. It explains that unlike traditional systems, agentic AI can be "up" but still cause catastrophic failures through incorrect autonomous actions, leading to a significantly wider "blast radius" of damage. Listeners will learn about the unique failure modes of these self-directed systems and the critical need to shift focus from mere availability to ensuring correctness and integrity.
    11 min
  • Phantom in the Page Cache: Unpacking the 10-Line "Copy Fail" Exploit
    2026/05/01
    This episode discusses a 9-year-old, 10-line "Copy Fail" exploit found in the Linux kernel's page cache, highlighting the paradox of such a critical yet subtle vulnerability evading detection for so long. It explores the nature of this "phantom" bug, explaining how its "surgical precision" and exploitation of concurrency in the page cache make it incredibly difficult to detect, even in highly scrutinized software. Listeners will learn about the profound implications of small flaws in critical system components and the challenges of securing complex, concurrent operating systems.
    13 min
  • Automating the Autopsy: The Promise and Peril of AI-Generated Postmortems
    2026/05/01
    This episode explores the intriguing concept of using AI to write incident postmortems, highlighting its potential for speed, consistency, and automating data synthesis from vast sources. However, it also delves into the significant perils, such as the impact of poor data quality, the risk of AI hallucinations, and AI's inability to grasp the nuanced human "why" behind incidents. Listeners will learn about the dichotomy between AI's data processing power and the essential human element in understanding complex system failures.
    13 min
  • The Harness and the Lobotomy: Unpacking Anthropic’s 47-Day Degradation
    2026/04/25
    This episode explores a 47-day incident where Anthropic's Claude Code appeared to degrade, revealing that the core AI model was intact but its 'harness'—the surrounding infrastructure and system prompts—failed. Listeners will learn how critical this 'harness' is for an AI product's effective performance, and how seemingly minor changes, like lowering default reasoning effort, can lead to significant user frustration and a breakdown of trust between a company and its users.
    18 min
  • Scaling for Ghosts: 7 Microservices, 47 Users, and the Trap of Resume-Driven Development
    2026/04/25
    This episode explores "Resume-Driven Development" through the case of an engineer at a pre-seed startup who built an enterprise-grade distributed system designed for 100,000 users when the product had only 47. It highlights how engineers might prioritize resume-boosting complex infrastructure over a startup's actual needs, leading to significant financial and human capital costs. Listeners will learn about the dangers of over-engineering and the critical misalignment of incentives in early-stage tech development.
    15 min
  • The 3,000 Incident Postmortem: Why Caches Are Actually the Enemy
    2026/04/20
    This episode explores Marc Brooker's controversial claim that caching, often a default scaling solution, is a major cause of catastrophic "metastable" system failures. It delves into the importance of deep postmortem analysis, moving beyond superficial root causes to question observability, testing, and fundamental architectural assumptions. Listeners will learn how unquestioning reliance on caching can create systems prone to persistent, unrecoverable breakdowns.
    17 min
  • The Interface Tax: Is Clean Architecture a Scam?
    2026/04/10
    This episode critically explores how dogmatic adherence to "Clean Architecture" principles, such as excessive layering and abstraction, can inadvertently hinder development velocity. It introduces concepts like the "Interface Tax" and "Lasagna Code," illustrating how over-engineering for unlikely future changes creates unnecessary complexity and friction for developers. Listeners will gain a critical perspective on common architectural practices and learn to identify when they might be detrimental to project progress.
    15 min
  • From Vibe-Coded to Enterprise: Handing the Pager to Claude
    2026/04/03
    This episode explores Incident.io's new remote Model Context Protocol (MCP) server, which enables AI assistants like Claude to directly access and interact with live production incident data. Listeners will learn how this "USB-C for AI" standard aims to reduce "dashboard fatigue" and streamline incident response by providing consolidated information, while also considering the potential trade-offs regarding deep system understanding and the "vibe-coded" origin of the technology.
    18 min