Ep. 8 - Building a C Compiler at Anthropic: A Stress Test for AI Reliability


Overview

This is Execution Over Everything. We take AI papers, blog posts, and big ideas that sound incredible on X… and we run them headfirst into reality. Not demos. Not vibes. Not one-shot prompts. We’re asking one question: what happens when this thing runs over and over again, under pressure, in the real world?


In this technical audit, we deconstruct Nicholas Carlini’s experiment where 16 parallel Claudes built a 100,000-line C compiler. We ignore the hype and look at the logs: the $20,000 API bill, the 'suicide' command that killed the harness, and why 16 agents turned into a 'Thunderherd' that clobbered its own code.


If you’re building AI infrastructure today, this is your sanity check on the reality of autonomous agents.


  • 00:00 - Alt Show Intro
  • 00:35 - Cold Open: The $20,000 Suicide
      Starts mid-thought with the "GPU bonfire" debate and the incident where an agent ran pkill -9 bash on its own harness.
  • 02:20 - The Claim: 16 Agents vs. A C Compiler
      Deconstructing Nicholas Carlini’s goal: building a 100,000-line Rust-based C compiler capable of building the Linux kernel.
  • 06:15 - Hidden Assumptions: Context Pollution & Time Blindness
      Discussing why the harness had to "pre-chew" logs to prevent context window pollution, and the agents' lack of wall-clock awareness.
  • 09:40 - Execution Reality Check: The Thunderherd Problem
      A deep dive into why 16 parallel agents deadlocked and clobbered each other's code when tasked with the monolithic Linux kernel.
  • 14:15 - The Verification Boundary: The Oracle Dependency
      Analyzing the "cheat code": using GCC as a known-good oracle to grade the AI’s work during the debugging loop.
  • 18:25 - The 16-Bit Wall: Where Intelligence Fails
      The audit of the 16-bit real mode failure, where the AI hit a hard optimization wall it could not reason its way out of.
  • 21:10 - Design Review: Burn Rate & Efficiency
      Evaluating the $20,000 API bill for code that remained less efficient than human-written software from 30 years ago.
  • 22:50 - What Builders Should Actually Do
      Practical guidance: focus on building the "jail" (the harness and task verifier) over the agent.
  • 24:10 - Closing Thought: Repetition is the Bottleneck
      Sticking the landing on the ironic truth: the intelligence isn’t the bottleneck; persistence is.
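
The "oracle" idea from the 14:15 chapter is a form of differential testing: run the same inputs through a trusted implementation and the one under test, then flag every disagreement. Below is a minimal Python sketch of that loop, not the harness from the experiment. All names (differential_test, oracle_div, candidate_div) are hypothetical, and plain Python functions stand in for "output of a GCC-built binary" and "output of the AI-built compiler's binary". The example bug is an assumption chosen for illustration: C integer division truncates toward zero, while the candidate floors.

```python
from typing import Callable, Iterable, List, Tuple

Case = Tuple[int, int]


def differential_test(
    candidate: Callable[[int, int], int],
    oracle: Callable[[int, int], int],
    cases: Iterable[Case],
) -> List[Tuple[Case, int, object]]:
    """Return (case, expected, actual) for every case where the two disagree."""
    mismatches = []
    for case in cases:
        expected = oracle(*case)  # the oracle is trusted, so it defines "correct"
        try:
            actual = candidate(*case)
        except Exception as exc:  # a crash in the candidate counts as a mismatch
            actual = f"<crashed: {type(exc).__name__}>"
        if actual != expected:
            mismatches.append((case, expected, actual))
    return mismatches


def oracle_div(a: int, b: int) -> int:
    """Hypothetical stand-in for the known-good side: C-style division (truncate toward zero)."""
    quotient = abs(a) // abs(b)
    return quotient if (a < 0) == (b < 0) else -quotient


def candidate_div(a: int, b: int) -> int:
    """Hypothetical stand-in for the implementation under test: floors instead of truncating."""
    return a // b


if __name__ == "__main__":
    cases = [(7, 2), (-7, 2), (7, -2), (-7, -2)]
    for case, expected, actual in differential_test(candidate_div, oracle_div, cases):
        print(f"mismatch on {case}: oracle={expected}, candidate={actual}")
```

The design point the sketch tries to capture is that the verifier never needs to know why the candidate is wrong; it only needs a trusted answer to compare against. That is also the limitation: the approach works only where such an oracle already exists.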



  • Anthropic engineering

  • building a C compiler

  • AI and compilers

  • determinism in software

  • AI reliability limits

  • correctness vs productivity

  • systems programming AI

  • execution constraints

  • retries and failure modes
