Episodes

  • Meta REFRAG: 30x Faster and Smarter Knowledge Access
    2025/09/09

    Tune into "REFRAG: Rethinking RAG Decoding" to discover a cutting-edge framework revolutionizing Retrieval-Augmented Generation (RAG) in Large Language Models (LLMs). Learn how REFRAG tackles the challenges of long-context inputs, which typically cause high latency and memory demands.


    This podcast explores REFRAG's innovative "compress, sense, and expand context" approach, which leverages attention sparsity in RAG contexts. We'll discuss its use of pre-computed chunk embeddings and a lightweight reinforcement learning (RL) policy that selects which chunks need to be expanded back into full token input, cutting down computationally intensive processing.


    Discover how REFRAG achieves up to 30.85× time-to-first-token (TTFT) acceleration (3.75× over previous methods) and extends LLM context size by 16× without losing accuracy. Join us to understand how REFRAG offers a practical and scalable solution for latency-sensitive, knowledge-intensive LLM applications.
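    The "compress, sense, and expand" flow described above can be sketched in a few lines of plain Python. Everything here is illustrative, not the real REFRAG code: `embed_chunk` stands in for a learned chunk encoder, and `relevance` stands in for the RL-trained expansion policy. The point is the shape of the idea: most retrieved chunks enter the decoder as a single precomputed embedding slot, and only the chunks the policy picks are expanded back into full tokens.

```python
# Toy sketch of REFRAG-style "compress, sense, expand":
# most retrieved chunks become one embedding slot each; a policy
# expands only the top-k chunks into full token sequences.
# All names are hypothetical stand-ins, not the real REFRAG API.

def embed_chunk(chunk: str) -> float:
    # Stand-in for a real chunk encoder: one scalar "embedding".
    return sum(map(ord, chunk)) / (len(chunk) * 128)

def relevance(chunk: str, query: str) -> int:
    # Stand-in for the RL policy's score: word overlap with the query.
    return len(set(chunk.lower().split()) & set(query.lower().split()))

def build_decoder_input(chunks, query, expand_k=1):
    # "Sense": rank chunks; "expand": top-k become full tokens.
    ranked = sorted(chunks, key=lambda c: relevance(c, query), reverse=True)
    expanded = set(ranked[:expand_k])
    slots = []
    for c in chunks:
        if c in expanded:
            slots.extend(c.split())                           # full tokens
        else:
            slots.append(("emb", round(embed_chunk(c), 3)))   # one slot
    return slots

chunks = ["cats sleep a lot", "REFRAG compresses retrieved context",
          "the weather is mild"]
inp = build_decoder_input(chunks, query="how does REFRAG compress context")
print(len(inp))  # far fewer decoder slots than total tokens in all chunks
```

    Shrinking the decoder input this way is what drives the TTFT speedup: prefill cost scales with the number of input slots, and most chunks now cost one slot instead of dozens of tokens.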

    20 min
  • OpenAI: Why LLMs Hallucinate and How Our Tests Make It Worse
    2025/09/07

    Why do AI chatbots confidently make up facts?

    This podcast explores the surprising reasons language models 'hallucinate'. We'll uncover how these plausible falsehoods originate from statistical errors during pretraining and persist because evaluations reward guessing over acknowledging uncertainty. Learn why models are optimized to be good test-takers, much like students guessing on an exam, and what it takes to build more trustworthy AI systems.
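    The exam analogy is easy to make concrete. Under the 0/1 grading most benchmarks use, abstaining scores nothing, so even a low-confidence guess has a higher expected score than an honest "I don't know". The probability below is illustrative, not from the paper:

```python
# Under binary (0/1) grading, abstaining scores 0, so any nonzero
# chance of a correct guess makes guessing the better test strategy.
p_correct = 0.25   # model's chance that a guess is right (illustrative)

expected_guess = p_correct * 1 + (1 - p_correct) * 0   # guess anyway
expected_abstain = 0.0                                 # honest "I don't know"

print(expected_guess > expected_abstain)  # True: the benchmark rewards guessing
```

    This is why evaluations that never credit calibrated uncertainty push models toward confident fabrication.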

    16 min
  • Beyond Chatbots: Building Robust LLM Agents with LangGraph
    2025/09/06

    Dive into LangGraph, the production-ready agent runtime designed to give you control and durability over your AI agents. Discover how LangGraph addresses the unique challenges of slow, flaky, and open-ended LLMs with features like parallelization, streaming, checkpointing, and human-in-the-loop. Whether you're building simple routers, dynamic tool-calling agents (like ReAct), or custom agent architectures, learn how to create sophisticated, task-specific systems that scale effectively and continuously improve.
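    A dynamic tool-calling agent of the ReAct flavor boils down to a small loop: the model proposes an action, the runtime executes the tool, the observation is appended to history, and the loop repeats until a final answer (with a step cap as a durability guard). A minimal, library-free sketch of that loop, where the `model` function and tool names are illustrative stand-ins rather than LangGraph's actual API:

```python
# Minimal ReAct-style loop: plan -> act -> observe -> repeat.
# `model` is a stand-in for an LLM; tools are plain functions.

def calculator(expr: str) -> str:
    # Toy tool; never eval untrusted input in real systems.
    return str(eval(expr, {"__builtins__": {}}))

TOOLS = {"calculator": calculator}

def model(history):
    # Stand-in policy: call the calculator once, then answer.
    observations = [s for s in history if s[0] == "observation"]
    if not observations:
        return ("tool", "calculator", "6 * 7")
    return ("final", f"The answer is {observations[-1][1]}")

def run_agent(question: str, max_steps: int = 5):
    history = [("question", question)]
    for _ in range(max_steps):        # bounded loop = durability guard
        step = model(history)
        if step[0] == "final":
            return step[1]
        _, tool_name, tool_input = step
        history.append(("observation", TOOLS[tool_name](tool_input)))
    return "gave up"

print(run_agent("What is 6 * 7?"))  # -> The answer is 42
```

    LangGraph's contribution is everything around this loop at production scale: checkpointing the `history` so runs survive crashes, streaming intermediate steps, and pausing for human approval before a tool executes.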

    20 min
  • The Gemmaverse Unleashed: Private, Powerful AI in Your Pocket
    2025/09/05

    Welcome to the "Gemmaverse Unlocked" podcast! Dive into the world of Google's Gemma family of open models, where State-of-the-Art AI meets On-Device and Offline capabilities.

    Join us as we explore:

    • EmbeddingGemma: The best-in-class, mobile-first embedding model designed for private, efficient semantic search and RAG pipelines directly on your hardware, even without internet connection.
    • Gemma 3 270M: A compact, hyper-efficient model that sets new performance levels for its size in instruction following, enabling specialized, on-device applications with extreme energy efficiency and enhanced user privacy.
    • Gemma 3n: A groundbreaking, mobile-first multimodal architecture bringing powerful image, audio, video, and text understanding to edge devices, with SOTA performance previously seen only in cloud models.

    Discover how these models empower developers to build private, fast, and accessible AI experiences on everyday devices. Learn about the innovations making sophisticated AI possible directly on your phone, laptop, or desktop, unlocking a new era of generative AI!
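    The on-device RAG pattern EmbeddingGemma targets is simple to sketch: embed documents once offline, store the vectors on the device, and answer queries with local nearest-neighbor search, so no text ever leaves the phone. In this toy, library-free illustration, the bag-of-words `embed` function is a stand-in for a real embedding model:

```python
# Toy on-device semantic search: embed once, search locally, no network.
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = ["gemma runs on device", "clouds drift over the sea",
        "offline search keeps data private"]
index = [(d, embed(d)) for d in docs]   # precomputed once, stored on device

def search(query: str) -> str:
    q = embed(query)
    return max(index, key=lambda pair: cosine(q, pair[1]))[0]

print(search("private offline data"))  # -> offline search keeps data private
```

    Swap the bag-of-words stand-in for a real dense embedding model and this same index-then-search loop becomes the retrieval half of a fully offline RAG pipeline.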
    14 min
  • Unpacking Implicit Reasoning: The Silent, Speedy Revolution in LLM Thinking
    2025/09/05

    Decoding the Silent Mind: Implicit Reasoning in LLMs

    Discover Implicit Reasoning, the cutting-edge method where Large Language Models (LLMs) solve complex, multi-step problems silently, using internal latent structures, without generating intermediate textual steps.

    Move beyond verbose "Chain-of-Thought" (CoT) prompting! Implicit reasoning offers significant benefits:

    • Lower generation cost and faster inference.
    • Better alignment with internal computation.
    • Enhanced resource efficiency.
    • Ability to explore more diverse reasoning paths internally, free from language constraints.


    We'll explore a novel taxonomy of implicit reasoning, focusing on execution paradigms such as latent optimization, signal-guided control, and layer-recurrent execution. Learn about the structural, behavioral, and representation-based evidence supporting its existence within LLMs.
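    Of these paradigms, layer-recurrent execution is the easiest to picture: rather than emitting a textual step, the model reapplies the same block to its hidden state for several silent iterations, refining the latent answer each time. A toy numeric illustration of that shape, with a made-up scalar "layer" standing in for a real transformer block:

```python
# Layer-recurrent execution, schematically: the same "layer" is applied
# repeatedly to a hidden state, refining it without emitting any tokens.

def layer(state: float, target: float) -> float:
    # Toy recurrent block: one silent refinement step toward `target`.
    return state + 0.5 * (target - state)

def implicit_reason(x0: float, target: float, steps: int) -> float:
    state = x0
    for _ in range(steps):        # latent iterations, no text produced
        state = layer(state, target)
    return state

# More silent iterations -> closer to the answer, zero generated tokens.
print(implicit_reason(0.0, 8.0, steps=6))
```

    The contrast with CoT is the cost model: here each extra reasoning step is one cheap forward pass through a reused block, not a run of generated and re-encoded tokens.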

    While promising, we'll also touch on challenges like limited interpretability, control, and the performance gap compared to explicit reasoning.

    Tune into "Decoding the Silent Mind" to understand how LLMs "think" beneath the surface, driving towards more efficient and robust AI.

    20 min