AI Explained Official Podcast

By: Philip - Host of AI Explained YT

About this content

Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.

© 2025 AI Explained Official Podcast
Personal Success, Politics & Government, Social Sciences, Self-Development
Episodes
  • What the Freakiness of 2025 in AI Tells Us About 2026
    2025/12/23

    It’s probably not possible to satisfactorily condense 12 months’ worth of weird progress in AI, as well as predictions for the year to come, into one video. But I’m gonna try anyway because it has been a very strange time.

    http://matsprogram.org/s26-aie


    My new app! https://lmcouncil.ai


    Patreon Interview: https://www.patreon.com/posts/robot-in-your-27-146376094

    Chapters:
    00:00 - Introduction
    00:34 - Reasoning Models … and limits
    02:54 - A playable world
    03:36 - Realism
    03:50 - AI Slop gone mainstream
    05:03 - DolphinGemma
    05:39 - Public Mood
    07:34 - AI Enlisted
    08:30 - GPT-5
    11:05 - Open Weight not out
    13:00 - METR Breakout
    17:30 - VASA-1
    18:28 - Lateral Productivity
    20:15 - 1 or 1000 benchmarks needed?
    24:54 - Continual Learning + Altman on Superintelligence
    28:08 - Automated Information Discovery ft AlphaEvolve


    Hassabis on Generality: https://x.com/demishassabis/status/2003097405026193809
    https://www.youtube.com/watch?v=PqVbypvxDto

    Gemini 3: https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
    Reasoning Trade-offs: https://arxiv.org/pdf/2504.13837

    DolphinGemma: https://blog.google/technology/ai/dolphingemma/?s=09

    Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/

    METR Time Horizon: https://arxiv.org/pdf/2503.14499
    https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/
    Flaws: https://x.com/ShashwatGoel7/status/2002369517499105443
    https://shash42.substack.com/p/how-to-game-the-metr-plot
    https://x.com/METR_Evals/status/2002203627377574113

    GPT-5 - Altman phd in everything: https://edition.cnn.com/2025/08/14/business/chatgpt-rollout-problems

    https://simple-bench.com/

    AI Slop: https://www.youtube.com/watch?v=I_3vxoJDD9k
    https://www.theguardian.com/technology/2025/dec/16/boost-for-artists-in-ai-copyright-battle-as-only-3-per-cent-back-uk-active-opt-out-plan

    Survey: https://x.com/SearchlightInst/status/2001057144842387920/photo/1

    Nvidia Nemotron: https://x.com/percyliang/status/2000608134205985169

    OpenAI Compute Flywheel: https://x.com/OpenAI/status/2001363007209914399/photo/1
    Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ

    AI in Govt: https://x.com/jdcmedlock/status/1939814516503847259

    Benchmark Gaming: https://techcrunch.com/2025/04/07/meta-exec-denies-the-company-artificially-boosted-llama-4s-benchmark-scores/

    AlphaEvolve: https://deepmind.google/blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/
    https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaevolve-a-gemini-powered-coding-agent-for-designing-advanced-algorithms/AlphaEvolve.pdf?utm_source=deepmind.google&utm_medium=referral&utm_campaign=gdm&utm_content=
    Continual Learning: https://abehrouz.github.io/files/NL.pdf

    Job Risk: https://archive.ph/20250708204527/https://www.axios.com/2025/05/28/ai-jobs-white-collar-unemployment-anthropic

    GPT4o: https://x.com/AISafetyMemes/status/1916889492172013989

    Vasa-1: https://www.microsoft.com/en-us/research/project/vasa-1/

    Three Views: https://www.lesswrong.com/posts/K2D45BNxnZjdpSX2j/ai-timelines
    Turing Test: https://x.com/tunguz/status/1907185471211422147

    Karpathy Year in Review: https://karpathy.bearblog.dev/year-in-review-2025/

    LLM Brainrot: https://arxiv.org/pdf/2510.13928

    Lateral Productivity: https://www.aisi.gov.uk/frontier-ai-trends-report

    Emotional Quotient: https://arxiv.org/pdf/2511.08394

    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    Podcast: https://aiexplainedopodcast.buzzsprout.com/


    AI Insiders ($9!): https://www.patreon.com/AIExplained

    33 min
  • Gemini Exponential, Demis Hassabis' ‘Proto-AGI’ coming, but …
    2025/12/19

    The condensed highlights of hours of AI lab leader interviews, model releases, Gemini 3 Flash insights (plus its hidden flaw), Hassabis’ ‘proto-AGI’ and much more…

    https://matsprogram.org/apply?utm_source=ai-explained&utm_medium=youtube&utm_campaign=s26

    Also, do check out my new app: https://lmcouncil.ai

    Chapters:
    00:00 - Introduction
    00:50 - Results
    02:44 - But… the Flaw
    04:49 - So Benchmarks are fake? No
    07:37 - Spatial Reasoning + Hassabis
    10:06 - Proto-AGI
    12:07 - Minimal AGI
    15:07 - Compute Slowdown
    17:56 - New Data Paradigm

    Gemini 3 Flash: https://deepmind.google/models/gemini/flash/

    Hassabis Interview: https://www.youtube.com/watch?v=PqVbypvxDto
    Legg Interview: https://www.youtube.com/watch?v=l3u_FAv33G0
    Pre-training Lead Interview: https://www.youtube.com/watch?v=cNGDAqFXvew
    Altman Interview: https://www.youtube.com/watch?v=2P27Ef-LLuQ
    Brockman Video: https://x.com/OpenAI/status/2001336514786017417
    Post-Training Reveal: https://x.com/OfficialLoganK/status/2001742530472534442

    Hallucinations Paper: https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4aaa5/why-language-models-hallucinate.pdf
    Patreon Hallucinations Vid: https://www.patreon.com/posts/blockers-to-and-139264812
    AA-Omniscience Benchmark: https://artificialanalysis.ai/evaluations/omniscience
    https://arxiv.org/pdf/2511.13029


    lmcouncil.ai/benchmarks
    https://simple-bench.com/
    https://x.com/scaling01/status/1999620587744813205

    5.2 Codex Drop: https://cdn.openai.com/pdf/ac7c37ae-7f4c-4442-b741-2eabdeaf77e0/oai_5_2_Codex.pdf

    OpenAI Compute Trend: https://www.theinformation.com/articles/openais-350-billion-computing-cost-problem?rc=sy0ihq

    Cramer Tweet/Response: https://x.com/BorisMPower/status/2001440650210976018

    OpenAI Valuation: https://www.theinformation.com/articles/openai-discussed-raising-tens-billions-valuation-around-750-billion?rc=sy0ihq

    Indian Data: https://www.reuters.com/world/india/with-freebies-openai-google-vie-indian-users-training-data-2025-12-17/

    TheInformation Data: https://x.com/theinformation/status/2001421225751351778

    Genie 3: https://deepmind.google/blog/genie-3-a-new-frontier-for-world-models/
    Sima 2: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
    Veo 3.1: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/

    METR: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks/


    AI Insiders ($9!): https://www.patreon.com/AIExplained


    Non-hype Newsletter: https://signaltonoise.beehiiv.com/

    20 min
  • GPT 5.2: OpenAI Strikes Back
    2025/12/12

    Full GPT-5.2 breakdown - did OpenAI reclaim the crown? A story of tokens, time and cost, plus 9 details you wouldn’t get just from reading the headlines.

    https://www.youtube.com/@eightythousandhours



    AI Insiders ($9!): https://www.patreon.com/AIExplained
    https://lmcouncil.ai

    Chapters:
    00:00 - Introduction
    00:55 - Better than Human @ Professional Tasks?
    04:42 - Test time Compute
    07:05 - Benchmark Selection
    09:32 - Simple Results + council comparison
    13:01 - Long Context
    13:52 - Self-Improvement
    15:00 - 10 Years + New Models

    Release Page: https://openai.com/index/introducing-gpt-5-2/

    GPT 5.2 Benchmark Comparison: https://www.reddit.com/r/singularity/comments/1pka1y9/gpt52_all_20_benchmarks_rankings_and_pricing/
    https://storage.googleapis.com/gweb-uniblog-publish-prod/original_images/gemini_3_table_final_HLE_Tools_on.gif
    https://lmcouncil.ai/benchmarks

    Charxiv: https://charxiv.github.io/#leaderboard

    GDPval: https://arxiv.org/pdf/2510.04374
    My vid: https://www.youtube.com/watch?v=oK5LxMaROSA

    Kilpatrick: https://x.com/OfficialLoganK/status/1999270402712023158/photo/1

    Noam Brown: https://x.com/polynoamial/status/1999189845164667132

    New Model in New Year: https://www.theinformation.com/articles/openai-developing-garlic-model-counter-googles-recent-gains?rc=sy0ihq

    10 Years of OpenAI: https://openai.com/index/ten-years/

    GPQA: https://x.com/idavidrein/status/1841265634170278063

    ARC-AGI 1-2: https://arcprize.org/arc-agi/2/

    Sunday Robotics: https://x.com/tonyzzhao/status/1991204839578300813


    Non-hype Newsletter: https://signaltonoise.beehiiv.com/


    https://lmcouncil.ai

    18 min