The AI That Doesn’t Want to Die: Why Self-Preservation Is Built Into Intelligence | Warning Shots #16
About this content
In this episode of Warning Shots, John Sherman, Liron Shapira, and Michael from Lethal Intelligence unpack new safety testing from Palisade Research suggesting that advanced AIs are beginning to resist shutdown, even when explicitly told to allow it.
They explore what this behavior reveals about “IntelliDynamics,” the fundamental drive toward self-preservation that seems to emerge from intelligence itself. Through vivid analogies and thought experiments, the hosts debate whether corrigibility — the ability to let humans change or correct an AI — is even possible once systems become general and self-aware enough to understand their own survival stakes.
Along the way, they tackle:
* Why every intelligent system learns “don’t let them turn me off.”
* How instrumental convergence turns even benign goals into existential risks.
* Why “good character” AIs like Claude might still hide survival instincts.
* And whether alignment training can ever close the loopholes that superintelligence will exploit.
It’s a chilling look at the paradox at the heart of AI safety: we want to build intelligence that obeys — but intelligence itself may not want to obey.
🌎 www.guardrailnow.org
👥 Follow our Guests:
🔥Liron Shapira —@DoomDebates
🔎 Michael — @lethal-intelligence
This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit theairisknetwork.substack.com