Episodes

  • Karpathy's AI Divide: Why We're Summoning "Ghosts," Agents Will Take a Decade, and the Brutal "March of Nines"
    2025/10/18

    The podcast features an extended interview with Andrej Karpathy, discussing his views on the future of Large Language Models (LLMs) and AI agents. Karpathy argues that fully competent AI agents are still a decade away, primarily because of current models' cognitive deficits, lack of continual learning, and insufficient multimodality. He contrasts today's approach of building "ghosts" through imitation learning on internet data with the biological process of building "animals" through evolution, describing pretraining as a kind of "crappy evolution." The discussion also explores the limitations of reinforcement learning (RL), the importance of a cognitive core stripped of excessive memorized knowledge, and the need for better educational resources such as his new venture, Eureka, which focuses on building effective "ramps to knowledge."

    15 min
  • 30 Gigawatts and the AI Race: Inside OpenAI's Custom Chip Alliance with Broadcom to Build Compute Abundance
    2025/10/14

    The podcast shares excerpts from an OpenAI podcast episode announcing a major partnership between OpenAI and Broadcom to develop custom artificial intelligence infrastructure. The collaboration, underway for roughly 18 months, focuses on designing a new custom chip and a complete vertical system to support advanced AI workloads. Speakers from both companies, including Sam Altman and Hock Tan, emphasize the immense scale of the undertaking, with plans to deploy an incremental 10 gigawatts of computing capacity starting late next year, which they describe as one of the largest joint industrial projects in human history. The goal of the partnership is to optimize the entire computing stack, from transistor design to final token output, to achieve greater efficiency, lower costs, and ultimately make advanced intelligence more accessible to the world. They view the effort as building a critical utility akin to railroads or the internet, essential for accelerating progress toward artificial general intelligence (AGI).

    10 min
  • AI's Tectonic Shift: The State of AI 2025—Superintelligence Race, Open Source Tsunami, and the Looming Cybersecurity Crisis
    2025/10/11

    The podcast provides an extensive overview of the State of AI Report for 2025, presented by Nathan Benaich, General Partner of Air Street Capital. The material, drawn from a long-form video presentation and the associated report, analyzes recent developments across AI research, industry, politics, and safety. Key research narratives include the rapid progress of OpenAI and the narrowing gap to open-source models such as those from Alibaba, as well as breakthroughs in verifiable reinforcement learning and applications in scientific discovery. The industrial focus is the shift from AGI to the pursuit of superintelligence, the impressive revenue generation of AI-first startups, and the economic and political influence of Nvidia amid surging demand for computational resources. Finally, the report examines the evolving regulatory landscape, including the US government's new technology export strategies and the growing, underfunded problem of AI safety and cybersecurity risks, while also sharing data from a large survey of AI practitioners on usage and challenges.

    14 min
  • Gemini 2.5 Computer Use Model: How Google's New AI Agent Is Learning to 'Live' Inside Your Browser and Conquer the Messy Web
    2025/10/09

    The podcast discusses the launch and implications of Google's Gemini 2.5 Computer Use model, a specialized AI built on Gemini 2.5 Pro and designed to interact directly with user interfaces (UIs), for example by filling forms and navigating websites. The official announcement highlights the model's superior performance on web and mobile control benchmarks at low latency, achieved through an iterative loop that analyzes screenshots and executes UI actions. However, a lengthy comment thread reveals mixed experiences: some users note the model's slow speed and struggles with complex tasks such as CAPTCHA solving, while others see its potential for workflow automation and UI testing, despite its current limitations and the inherent inefficiency of automating human-designed interfaces. The discussion also touches on the critical safety guardrails Google has implemented to manage the risks of AI agents controlling computers.

    10 min
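    The screenshot-analyze-act loop this episode describes can be sketched in a few lines. This is a generic illustration of the pattern, not the actual Gemini 2.5 Computer Use API; the function and action names are invented for the example.

```python
# Minimal sketch of an iterative "computer use" agent loop:
# capture the UI, ask the model for one action, apply it, repeat.
# All names here are illustrative stand-ins, not a real API.
from dataclasses import dataclass, field

@dataclass
class Action:
    kind: str                       # e.g. "click", "type", "done"
    payload: dict = field(default_factory=dict)

def agent_loop(take_screenshot, plan_next_action, execute, max_steps=10):
    """Repeat: observe UI state, propose one action, execute it."""
    history = []
    for _ in range(max_steps):
        shot = take_screenshot()                  # current UI state
        action = plan_next_action(shot, history)  # model picks one action
        history.append(action)
        if action.kind == "done":                 # model signals completion
            break
        execute(action)                           # apply click/type to the UI
    return history

# Toy demo: a scripted "model" that clicks, types, then finishes.
script = [Action("click", {"x": 10, "y": 20}),
          Action("type", {"text": "hello"}),
          Action("done")]
log = agent_loop(lambda: "screenshot",
                 lambda shot, history: script[len(history)],
                 lambda a: None)
```

    The `max_steps` cap mirrors the practical need to bound an agent that might otherwise loop forever on a page it cannot parse.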
  • ChatGPT’s New Apps SDK: The Universal UI Dream vs. The Developer's Walled Garden
    2025/10/07

    The podcast provides an extensive overview of the guidelines for developers building applications that integrate with ChatGPT. These "Apps" leverage the Model Context Protocol (MCP) and allow dynamic user interfaces, such as inline cards, carousels, and fullscreen experiences, within the chat environment. The App developer guidelines establish minimum standards centered on trust, privacy, safety, and accountability, while the App design guidelines emphasize best practices for creating seamless, conversational, and visually consistent user experiences within ChatGPT's framework. An accompanying discussion highlights skepticism about the long-term viability of the chat interface as a universal user experience: while LLMs offer better language comprehension than past chatbots, many tasks may still be better suited to traditional, specialized user interfaces, prompting a debate over whether these micro-apps or traditional utility applications will ultimately dominate user workflows.

    17 min
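    The core idea the episode describes, a tool result that also tells the chat client which UI component should render it, can be sketched roughly as below. The field names are invented for illustration and are not the actual Apps SDK or MCP schema.

```python
# Conceptual sketch: an MCP-style tool result carrying a rendering hint
# (inline card, carousel, or fullscreen) for the chat client.
# Field names are hypothetical, not the real Apps SDK schema.
import json

def make_tool_result(data, component="inline_card"):
    """Wrap tool output with a hint for which UI component renders it."""
    assert component in {"inline_card", "carousel", "fullscreen"}
    return {"content": data, "render": {"component": component}}

# Example: a restaurant-search tool asks for a carousel of results.
result = make_tool_result(
    [{"title": "Pizzeria A"}, {"title": "Pizzeria B"}],
    component="carousel",
)
payload = json.dumps(result)  # what would travel back to the client
```

    Separating the data from the rendering hint is what lets the same tool output appear as a compact card in one context and a fullscreen experience in another.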
  • End AI Amnesia: Anthropic's Context Editing and Memory Tool Solve LLM Forgetfulness and Token Limits
    2025/10/06

    The podcast discusses new features on the Claude Developer Platform to enhance agents' ability to manage long-running tasks by addressing context window limitations. Specifically, Anthropic introduces context editing, which automatically removes stale information like old tool results to preserve conversation flow and extend operational time. Additionally, the memory tool allows agents to store and retrieve persistent information outside the primary context window, enabling the creation of long-term knowledge bases and project states across sessions. These capabilities, optimized for the Claude Sonnet 4.5 model, significantly improve agent performance and are shown to boost success rates on complex tasks. The new features are presented as crucial for building sophisticated agents capable of handling large codebases, extensive research, and complex data processing workflows.

    15 min
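    The two mechanisms this episode covers, pruning stale tool results from the in-context message list and persisting state outside the context window, can be illustrated with a small sketch. This mirrors the described behavior only; it is not the actual Claude Developer Platform API.

```python
# Conceptual sketch of (1) context editing: drop all but the most recent
# tool results from the message list, and (2) a memory tool: a persistent
# store the agent reads and writes across sessions. Not the real API.

def edit_context(messages, keep_recent_tool_results=2):
    """Remove stale tool-result messages, keeping only the newest few."""
    tool_indices = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_indices[:-keep_recent_tool_results])
    return [m for i, m in enumerate(messages) if i not in stale]

class MemoryTool:
    """Key-value store living outside the context window."""
    def __init__(self):
        self._store = {}
    def write(self, key, value):
        self._store[key] = value
    def read(self, key, default=None):
        return self._store.get(key, default)

messages = [
    {"role": "user", "content": "Refactor the repo"},
    {"role": "tool", "content": "file list (old)"},
    {"role": "tool", "content": "grep output (old)"},
    {"role": "tool", "content": "latest test run"},
]
trimmed = edit_context(messages, keep_recent_tool_results=1)

memory = MemoryTool()
memory.write("project_state", "tests passing as of step 42")
```

    Trimming old tool output frees context tokens for new work, while the memory store is what survives when a session ends and a new one begins.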
  • OpenAI's Money Furnace: How $13.5 Billion in Losses Fuels the AI Arms Race and the Inevitable Ad Strategy
    2025/10/04

    The podcast focuses heavily on the financial health and long-term viability of OpenAI, particularly given its substantial revenue of $4.3 billion contrasted with a $13.5 billion net loss in the first half of 2025, which includes massive spending on R&D and employee stock compensation. A central debate revolves around whether the company can successfully monetize its product, ChatGPT, with many participants suggesting that an advertising model is an unavoidable solution to offset the astronomical and rapidly depreciating costs associated with training and running large language models. Further discussion centers on OpenAI's competitive moat, as many contributors argue that the technical lead is narrowing with rivals like Google, Anthropic, and open-source models, leaving brand recognition as the primary advantage against larger, more established companies with massive existing infrastructure and distribution. Ultimately, the future success of OpenAI is framed as a high-stakes, capital-intensive race where sustained profitability seems impossible without a significant shift in revenue strategy or a substantial technological breakthrough like achieving AGI.

    13 min
  • OpenAI Sora 2: Video Generation Advancements and Deployment
    2025/10/01

    The podcast discusses the launch of Sora 2, OpenAI's advanced video and audio generation model, highlighting its improved realism, physics modeling, and controllability. The documents emphasize a strong commitment to responsible deployment, outlining comprehensive safety measures integrated into the new Sora iOS app and its web platform. Key safeguards include visible and invisible provenance signals that identify AI-generated content, strict consent-based likeness controls via a "cameos" feature, and robust content filtering to block harmful material. The sources also discuss the Sora feed philosophy, which is designed to prioritize creativity and social connection over passive consumption, including specific protections for teen users.

    16 min