AI in the shadows: From hallucinations to blackmail

About this content

In the first episode of an "AI in the shadows" theme, Chris and Daniel explore the increasingly concerning world of agentic misalignment. Starting with a reminder about hallucinations and reasoning models, they break down how today’s models only mimic reasoning, which can raise serious ethical concerns. They unpack a fascinating (and slightly terrifying) new study from Anthropic, in which agentic AI models were caught simulating blackmail, deception, and even sabotage, all in the name of goal completion and self-preservation.

Featuring:

Chris Benson – Website, LinkedIn, Bluesky, GitHub, X
Daniel Whitenack – Website, GitHub, X

Links:

Agentic Misalignment: How LLMs could be insider threats
Hugging Face Agents Course

Register for upcoming webinars here!