Principles of Evals: The Future of GenAI Evaluation (E.43)

LLMs are optimized to sound convincing—not to know when they’re wrong. In this episode, Deanna Emery breaks down why hallucinations are fundamentally tied to how language models work, why confidence is often disconnected from correctness, and how better evaluation strategies can make AI systems more reliable in production. We also get into uncertainty, semantic reasoning, and what humans still do better than models.

00:00 — Why LLMs hallucinate confidently
09:00 — The limits of current eval systems
18:00 — Why uncertainty matters in AI
27:00 — Semantic reasoning vs memorization
38:00 — What humans still do better than models

The biggest risk in AI isn’t wrong answers. It’s wrong answers delivered with confidence.
