Week 43, midweek episode
About this content
In this comprehensive episode, we merge two days of rapid-fire AI developments into one essential deep dive. We explore new frameworks that aim to define and measure Artificial General Intelligence (AGI), along with research revealing the surprising limits of "reflection" in today's most advanced models. We cover major breakthroughs in healthcare, where AI models like DeepSomatic are finding previously missed cancer variants in children, and in law, where new legal frameworks are redefining corporate accountability in the age of algorithms. From clever 'jailbreaking' attacks that expose critical vulnerabilities to the fun side of AI playing Dungeons & Dragons, this episode covers the innovations, the risks, and the societal shifts shaping our world. The references from this episode are listed below.
References and Further Reading
- A Definition of AGI: Proposes a quantifiable framework to define and measure Artificial General Intelligence based on human cognitive abilities.
- DeepSomatic: Details Google's AI model that identified 10 previously missed genetic variants in pediatric leukemia cells.
- Distractor Injection Attacks: Reveals how top LLMs can be distracted by irrelevant tasks, cutting task accuracy by up to 60%.
- DTKG: A dual-track knowledge graph framework that improves complex multi-hop question answering in RAG systems.
- From Local to Global (GISP): Introduces GISP, a structured pruning method making LLMs up to 50% smaller without losing performance.
- FST.ai 2.0: An explainable AI system to assist Taekwondo referees, reducing decision review times by 85%.
- Illusions of reflection: Shows that frontier LLMs lack functional, goal-driven reflective reasoning, a key gap in current AI capabilities.
- Is Multilingual LLM Watermarking Truly Multilingual? (STEAM): Presents STEAM, a method using back-translation to fix fairness issues and ensure watermarking works in low-resource languages.
- Na Prática, qual IA Entende o Direito?: A study finding that a specialized legal AI (JusIA) significantly outperforms general models like ChatGPT on legal tasks.
- Operationalising Extended Cognition: Proposes a legal framework for holding corporations accountable for decisions made by their AI systems.
- The Right to Be Remembered: Argues for a digital right to combat the erasure of minority voices and cultural memory by LLMs.
- Team-Phi: A multi-agent framework that automatically evaluates and selects models for anonymizing patient health data.
- VERA-V: A framework that automates the discovery of 'jailbreak' vulnerabilities in multimodal AIs like GPT-4o.
- What Limits Agentic Systems Efficiency? (SpecCache): Introduces SpecCache, a method to speed up web-based AI agents by up to 3.2x via intelligent caching.