When AI Becomes Your SRE: How Incident.io Is Automating Incident Response

About this content

When your site goes down, every second counts. For years, Incident.io has helped engineering teams coordinate through chaos: getting the right people in the room, keeping stakeholders informed, and restoring order fast. Now they’re building something new: an AI SRE that can actually help diagnose and respond to incidents.

In this episode, Teresa Torres talks with Lawrence Jones (Founding Engineer) and Ed Dean (Product Lead for AI) about how their team is teaching AI to think like a site reliability engineer. They share how they went from simple prototypes that summarized incidents to a multi-agent system that forms hypotheses, tests them, and even drafts fixes, all from within Slack.

You’ll hear how they:

- Identify which parts of debugging can safely be automated
- Combine retrieval, tagging, and re-ranking to find relevant context fast
- Use post-incident “time travel” evals to measure how well their AI performed
- Balance human trust and AI confidence inside high-stakes workflows

This is a masterclass in designing AI systems that think, reason, and collaborate like expert teammates.