Semi Doped

By Vikram Sekar and Austin Lyons

About

The business and technology of semiconductors. Alpha for engineers and investors alike.

© 2026 Semi Doped
Episodes
  • An Interview with Microsoft's Saurabh Dighe About Maia 200
    2026/01/28

    Maia 100 was a pre-GPT accelerator.
    Maia 200 is explicitly post-GPT for large multimodal inference.

    Saurabh Dighe says if Microsoft were chasing peak performance or trying to span training and inference, Maia would look very different. Higher TDPs. Different tradeoffs. Those paths were pruned early to optimize for one thing: inference price-performance. That focus drives the claim of ~30% better performance per dollar versus the latest hardware in Microsoft’s fleet.
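
    A rough illustration of what that metric means (the throughput and cost numbers below are invented for the example; Microsoft has not published them). Performance per dollar is just throughput divided by cost, so a ~30% gain can be read two ways: ~30% more tokens served at the same spend, or the same tokens at roughly 23% lower spend.

    ```python
    # Hypothetical price-performance arithmetic -- numbers are made up
    # for illustration, not Microsoft's actual fleet figures.

    baseline_tokens_per_sec = 10_000   # assumed throughput on incumbent hardware
    baseline_cost_per_hour = 8.00      # assumed fully loaded $/hour

    def tokens_per_dollar(tokens_per_sec: float, cost_per_hour: float) -> float:
        """Inference work delivered per dollar of compute spend."""
        return tokens_per_sec * 3600 / cost_per_hour

    baseline = tokens_per_dollar(baseline_tokens_per_sec, baseline_cost_per_hour)
    improved = 1.30 * baseline  # the claimed ~30% better price-performance

    # Two equivalent readings of the same 1.30x ratio:
    more_throughput = baseline_tokens_per_sec * 1.30   # 13,000 tok/s at $8.00/hr
    lower_cost = baseline_cost_per_hour / 1.30         # 10,000 tok/s at ~$6.15/hr

    print(f"baseline: {baseline:,.0f} tokens per dollar")
    print(f"+30%:     {improved:,.0f} tokens per dollar")
    print(f"same spend -> {more_throughput:,.0f} tok/s | same work -> ${lower_cost:.2f}/hr")
    ```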

    Interesting topics include:
    • What “30% better price-performance” actually means
    • Who Maia 200 is built for
    • Why Microsoft bet on inference when designing Maia back in 2022/2023
    • Large SRAM + high-capacity HBM
    • Massive scale-up, no scale-out
    • On-die NIC integration

    Maia is a portfolio platform: many internal customers, varied inference profiles, one goal. Lower inference cost at planetary scale.

    Chapters:
    (00:00) Introduction
    (01:00) What Maia 200 is and who it’s for
    (02:45) Why custom silicon isn’t just a margin play
    (04:45) Inference as an efficient frontier
    (06:15) Portfolio thinking and heterogeneous infrastructure
    (09:00) Designing for LLMs and reasoning models
    (10:45) Why Maia avoids training workloads
    (12:00) Betting on inference in 2022–2023, before reasoning models
    (14:40) Hyperscaler advantage in custom silicon
    (16:00) Capacity allocation and internal customers
    (17:45) How third-party customers access Maia
    (18:30) Software, compilers, and time-to-value
    (22:30) Measuring success and the Maia 300 roadmap
    (28:30) What “30% better price-performance” actually means
    (32:00) Scale-up vs scale-out architecture
    (35:00) Ethernet and custom transport choices
    (37:30) On-die NIC integration
    (40:30) Memory hierarchy: SRAM, HBM, and locality
    (49:00) Long context and KV cache strategy
    (51:30) Wrap-up

    53 min
  • Can Pre-GPT AI Accelerators Handle Long Context Workloads?
    2026/01/26

    OpenAI's partnership with Cerebras and Nvidia's announcement of context memory storage raise a fundamental question: as agentic AI demands week-long sessions with massive context windows, can SRAM-based accelerators designed before the LLM era keep up—or will they converge with GPUs?

    Key Takeaways
    1. Context is the new bottleneck. As agentic workloads demand long sessions with massive codebases, storing and retrieving KV cache efficiently becomes critical (a sizing sketch follows this list).
    2. There's no one-size-fits-all. Sachin Khatti (OpenAI, ex-Intel) signals a shift toward heterogeneous compute—matching specific accelerators to specific workloads.
    3. Cerebras has 44GB of SRAM per wafer — orders of magnitude more than typical chips — but the question remains: where does the KV cache go for long context?
    4. Pre-GPT accelerators may converge toward GPUs. If they need to add HBM or external memory for long context, some of their differentiation erodes.
    5. Post-GPT accelerators (Etched, MatX) are the ones to watch. Designed specifically for transformer inference, they may solve the KV cache problem from first principles.
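
    To put takeaways 1 and 3 into numbers, a back-of-envelope KV cache sizing sketch follows. The layer, head, and precision values are generic Llama-70B-class assumptions, not figures from the episode; only the 44GB-per-wafer SRAM and four-wafer Llama 70B points come from the show.

    ```python
    # Back-of-envelope KV cache sizing for a Llama-70B-class model.
    # Architecture values are assumed typical numbers, not from the episode.

    n_layers   = 80    # transformer blocks
    n_kv_heads = 8     # grouped-query-attention KV heads
    head_dim   = 128   # dimension per head
    elem_bytes = 2     # fp16/bf16 element size

    # Each token stores one K and one V vector per layer.
    kv_bytes_per_token = 2 * n_layers * n_kv_heads * head_dim * elem_bytes
    print(f"KV cache per token: {kv_bytes_per_token / 1024:.0f} KiB")  # ~320 KiB

    for context_len in (8_192, 128_000, 1_000_000):
        gb = context_len * kv_bytes_per_token / 1e9
        print(f"{context_len:>9,} tokens -> {gb:7.1f} GB of KV cache")

    # Against 44 GB of SRAM per Cerebras wafer (per the episode), a single
    # 128K-token session (~42 GB here) nearly fills a wafer by itself, and
    # fp16 weights alone need ~140 GB (70e9 params * 2 bytes) -- hence the
    # four-wafer Llama 70B setup, and the open question of where long-context
    # KV cache spills: HBM, DRAM, SSD, or network storage.
    ```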

    Chapters
    - 00:00 — Intro
    - 01:20 — What is context memory storage?
    - 03:30 — When Claude runs out of context
    - 06:00 — Tokens, attention, and the KV cache explained
    - 09:07 — The AI memory hierarchy: HBM → DRAM → SSD → network storage
    - 12:53 — Nvidia's G1/G2/G3 tiers and the missing G0 (SRAM)
    - 14:35 — Bluefield DPUs and GPU Direct Storage
    - 15:53 — Token economics: cache hits vs misses
    - 20:03 — OpenAI + Cerebras: 750 megawatts for faster Codex
    - 21:29 — Why Cerebras built a wafer-scale engine
    - 25:07 — 44GB SRAM and running Llama 70B on four wafers
    - 25:55 — Sachin Khatti on heterogeneous compute strategy
    - 31:43 — The big question: where does Cerebras store KV cache?
    - 34:11 — If SRAM offloads to HBM, does it lose its edge?
    - 35:40 — Pre-GPT vs Post-GPT accelerators
    - 36:51 — Etched raises $500M at $5B valuation
    - 38:48 — Wrap up

    38 min
  • An Interview with Innoviz CEO Omer Keilaf about current LiDAR market dynamics
    2026/01/22

    Innoviz CEO Omer Keilaf believes the LiDAR market is down to its final players—and that Innoviz has already won its seat.

    In this conversation, we cover the Level 4 gold rush sparked by Waymo, why stalled Level 3 programs are suddenly accelerating, the technical moat that separates L4-grade LiDAR from everything else, how a one-year-old startup won BMW, and why Keilaf thinks his competitors are already out of the race.

    Omer Keilaf founded Innoviz in 2016. Today it's a publicly traded Tier 1 supplier to BMW, Volkswagen, Daimler Truck, and other global OEMs.

    Chapters
    00:00 Introduction
    00:17 Why Start a LIDAR Company in 2016?
    01:32 The Personal Story Behind Innoviz
    03:12 Transportation Is Still Our Biggest Daily Risk
    04:28 The 2012 Spark: Xbox Kinect and 3D Sensing
    06:32 From Mobile to Automotive: Finding the Right Platform
    07:54 "I Didn't Know What LIDAR Was, But I'd Do It Better"
    08:19 How a One-Year-Old Startup Won BMW
    10:04 Surviving the First Product
    11:23 From Tier 2 to Tier 1: The Volkswagen Win
    13:47 Lessons Learned Scaling Through Partners
    14:45 The SPAC Decision: A Wake-Up Call from a Competitor
    16:42 From 200 LIDAR Companies to a Handful
    17:27 NREs: How Tier 1 Status Funds R&D
    18:44 Why Automotive-First Is the Right Strategy
    19:45 Consolidation Patterns: Cameras, Radars, Airbags
    20:31 "The Music Has Stopped"
    21:07 Non-Automotive: Underserved Markets
    23:51 Working with Secretive OEMs
    25:27 The Press Release They Tried to Stop
    26:42 CES 2025: 85% of Meetings Were Level 4
    27:40 Why Level 3 Programs Are Suddenly Accelerating
    28:33 The EV/ADAS Coupling Problem
    29:49 Design Is Everything: The Holy Grail Is Behind the Windshield
    31:13 The Three-Year RFQ: Grill → Roof → Windshield
    32:32 Innoviz3: Small Enough for Behind-the-Windshield
    34:40 Innoviz2 for L4, Innoviz3 for Consumer L3
    36:38 What's the Real Difference Between L2, L3, and L4 LIDAR?
    38:51 The Mud Test: Why L4 Demands 100% Availability
    40:50 "We're the Only LIDAR Designed for Level 4"
    42:52 Patents and the Maslow Pyramid of Autonomy
    44:15 Non-Automotive Markets: Agriculture, Mining, Security
    46:15 Closing

    47 min