The Superposition Problem
About this content
This episode of "Two Minds, One Model" explores the critical concept of interpretability in AI systems, focusing on Anthropic's research paper "Toy Models of Superposition." Hosts John Jezl and Jon Rocha from Sonoma State University's Computer Science Department delve into why neural networks are often "black boxes" and what this means for AI safety and deployment.
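Curious listeners can see the core idea in a few lines of code. This is a minimal sketch of our own (not from the paper or the episode, and it omits the paper's bias term): n sparse features packed into d < n hidden dimensions and read back out as ReLU(W^T W x).

# Toy superposition sketch in Python/NumPy: 5 features, 2 hidden dims.
import numpy as np

rng = np.random.default_rng(0)
n_features, d_hidden = 5, 2

# Random unit-norm feature directions; with n > d they can't all be
# orthogonal, so features must share hidden dimensions.
W = rng.normal(size=(d_hidden, n_features))
W /= np.linalg.norm(W, axis=0)

# A sparse input: only feature 3 is active.
x = np.zeros(n_features)
x[3] = 1.0

h = W @ x                       # compress: 5 features -> 2 dims
x_hat = np.maximum(W.T @ h, 0)  # read out with a ReLU

print(np.round(x_hat, 2))
# Feature 3 comes back strongest; the smaller nonzero entries are
# interference between features, the price of superposition.

The paper's result, roughly, is that when features are sparse (rarely active together), a trained network accepts this interference and stores more features than it has dimensions.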
Credits
Cover Art by Brianna Williams
TMOM Intro Music by Danny Meza
A special thank you to these talented artists for their contributions to the show.
----------------------------------------------------
Links and References
Academic Papers
"Toy Models of Superposition" - Anthropic (September 2022)
"Alignment Faking in Large Language Models" - Anthropic (December 2024)
"Agentic Misalignment: How LLMs Could Be Insider Threats" - Anthropic (January 2025)
News
https://www.npmjs.com/package/@anthropic-ai/claude-code
https://www.wired.com/story/thinking-machines-lab-first-product-fine-tune/
https://www.wired.com/story/chatbots-play-with-emotions-to-avoid-saying-goodbye/
Harvard Business School study on companion chatbots
Misc
"Words are but vague shadows of the volumes we mean" - Theodore Dreiser
3Blue1Brown video about vectors - https://www.youtube.com/shorts/FJtFZwbvkI4
GPT-3 parameter count correction: https://en.wikipedia.org/wiki/GPT-3#:~:text=GPT%2D3%20has%20175%20billion,each%20parameter%20occupies%202%20bytes.
"ImageNet: A Large-Scale Hierarchical Image Database"
We mention Waymo a lot in this episode and felt it was important to link to their safety page: https://waymo.com/safety/
Abandoned Episode Titles
"404: Interpretation Not Found"
"Neurons Gone Wild: Spring Break Edition"
"These Aren't the Features You're Looking For”
"Bigger on the Inside"