AI Research Today

By: Aaron

About this content

AI Research Today unpacks the latest advancements in artificial intelligence, one paper at a time. We go beyond abstracts and headlines, walking through architectures, experiments, training details, ablations, failure modes, and the implications for future work. Each episode covers one to three new, impactful research papers in depth, at the level of an industry practitioner or AI researcher. If you want to understand the newest topics in AI research but don't have the time to dig through the papers yourself, this is your solution.

© 2025 AI Research Today
Science
Episodes
  • Transformer-Squared: Self-Adaptive LLMs
    2025/12/11

    In this episode we’re diving into “Transformer-Squared: Self-Adaptive LLMs” — a new framework for adapting large language models to unseen tasks on the fly by tuning only a small part of their weights. The central idea is Singular Value Fine-Tuning (SVF), a parameter-efficient fine-tuning technique that decomposes each weight matrix with Singular Value Decomposition (SVD) and then only trains a small vector that scales the singular values. These vectors become compact “expert” modules that specialize in different tasks and, unlike traditional methods like LoRA, can be composed, mixed, and reused because they’re in a principled, orthogonal basis.
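As a rough illustration of the SVF idea described above, the sketch below decomposes one weight matrix with SVD and rescales its singular values with a small vector. It is a minimal sketch, not the paper's implementation: the function name `svf_adapt`, the matrix shape, and the values in `z_expert` are assumptions made up for this example, and no training loop is shown.

```python
import numpy as np


def svf_adapt(weight: np.ndarray, z: np.ndarray) -> np.ndarray:
    """Rebuild a weight matrix with its singular values rescaled by z."""
    # Thin SVD: weight == u @ diag(s) @ vt
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # Multiplying u by (s * z) scales column i of u by s[i] * z[i],
    # which is the same as u @ diag(s * z).
    return (u * (s * z)) @ vt


rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))       # stand-in for one frozen weight matrix

z_identity = np.ones(4)               # all ones: the matrix is unchanged
W_same = svf_adapt(W, z_identity)

z_expert = np.array([1.2, 1.0, 0.8, 0.5])   # hypothetical "expert" vector
W_adapted = svf_adapt(W, z_expert)

# Only z would be trained: 4 numbers here, versus 24 entries in W.
print(z_expert.size, W.size)
```

In this toy setting the trainable vector has one entry per singular value, which is where the parameter savings relative to a full-matrix update come from.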

    During inference, Transformer-Squared runs a two-pass process — the first pass identifies the task or context, and the second pass combines the appropriate expert vectors to dynamically adapt the model’s behavior in real time. Across benchmarks and architectures, SVF consistently outperforms LoRA despite requiring orders of magnitude fewer parameters, and the framework even shows versatility on multimodal tasks like vision-language.
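The two-pass flow above can be sketched in a few lines. Everything here is a placeholder: `identify_task` is a keyword-matching stand-in for whatever the first pass actually does, and the skill names and expert vectors are invented for illustration. The only point is the shape of the computation: the first pass produces per-skill weights, and the second pass uses a weighted sum of expert vectors.

```python
def identify_task(prompt: str) -> dict[str, float]:
    """First pass (stand-in): score each known skill for this prompt."""
    # Hypothetical keyword lists; a real system would not dispatch this way.
    keywords = {"math": ("solve", "integral"), "code": ("function", "bug")}
    text = prompt.lower()
    hits = {skill: sum(word in text for word in words)
            for skill, words in keywords.items()}
    total = sum(hits.values())
    if total == 0:
        # No signal: fall back to a uniform mix of all experts.
        return {skill: 1.0 / len(hits) for skill in hits}
    return {skill: count / total for skill, count in hits.items()}


def mix_experts(experts: dict[str, list[float]],
                weights: dict[str, float]) -> list[float]:
    """Second pass: weighted sum of expert vectors, entry by entry."""
    length = len(next(iter(experts.values())))
    return [sum(weights[name] * experts[name][i] for name in experts)
            for i in range(length)]


# Hypothetical expert vectors, one entry per singular value.
experts = {"math": [1.2, 1.0, 0.8], "code": [0.9, 1.1, 1.0]}

weights = identify_task("Fix the bug in this function")
z_mixed = mix_experts(experts, weights)
print(weights, z_mixed)
```

The mixed vector would then scale the singular values of each adapted weight matrix before the second forward pass.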

    If you’re into efficient adaptation, reinforcement-learning optimization of model components, and self-organizing AI systems, this paper is a big step toward real-time adaptive foundation models. Read the full paper here: https://arxiv.org/pdf/2501.06252

    40 min
  • Nested Learning: The Illusion of Deep Learning Architectures
    2025/12/01

    NL.pdf

    In this episode, we dive into Nested Learning (NL) — a new framework that rethinks how neural networks learn, store information, and even modify themselves. While modern language models have made remarkable progress, fundamental questions remain: How do they truly memorize? How do they improve over time? And why does in-context learning emerge at scale?

    Nested Learning proposes a bold answer. Instead of viewing a model as a single optimization problem, NL treats it as a hierarchy of nested, multi-level learning processes, each with its own evolving context flow. This perspective sheds new light on how deep models compress information, how in-context learning arises naturally, and how we might build systems with richer, higher-order reasoning abilities.

    We explore the paper’s three major contributions:

    • Deep Optimizers — A reinterpretation of classic optimizers like Adam and SGD-Momentum as associative memory systems that compress gradients. The authors introduce deeper, more expressive optimizers built directly from NL principles.

    • Self-Modifying Titans — A new type of sequence model that learns not just from data, but from its own update rules, enabling it to modify itself during training.

    • Continuum Memory System — A unified framework that extends the idea of short- vs long-term memory into a continuous space. Combined with self-modifying models, it leads to HOPE, a learning module showing strong results in language modeling, continual learning, and long-context reasoning.

    This episode breaks down what NL means for the future of AI, why it’s mathematically transparent and neuroscientifically inspired, and how it might open a new dimension in deep learning research.
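One way to get a feel for the "optimizer as memory" reading in the Deep Optimizers item is to check that a plain momentum update coincides with a single gradient-descent step on a small inner objective over the momentum buffer. The inner objective below, L(m) = -<m, g> + (1 - beta)/2 * ||m||^2, is chosen here for illustration and may differ from the formulation used in the paper; the function names are likewise assumptions.

```python
def momentum_step(m, g, beta):
    """Classic momentum: decay the buffer, then add the new gradient."""
    return [beta * mi + gi for mi, gi in zip(m, g)]


def inner_gd_step(m, g, beta, lr=1.0):
    """One gradient-descent step on L(m) = -<m, g> + (1 - beta)/2 * ||m||^2."""
    # dL/dm = -g + (1 - beta) * m
    grad = [-gi + (1.0 - beta) * mi for mi, gi in zip(m, g)]
    return [mi - lr * gr for mi, gr in zip(m, grad)]


m = [0.5, -1.0, 2.0]     # current momentum buffer (arbitrary values)
g = [0.1, 0.3, -0.2]     # incoming gradient (arbitrary values)

via_momentum = momentum_step(m, g, beta=0.9)
via_inner_gd = inner_gd_step(m, g, beta=0.9)
print(via_momentum)
print(via_inner_gd)
```

With a step size of 1 the two updates agree up to floating-point error, so the buffer can be read as a small learned memory that is pulled toward recent gradients while older ones decay.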

    50 min
  • AgentEvolver: An Autonomous Agent Framework
    2025/11/24

    https://arxiv.org/pdf/2511.10395

    What if AI agents could teach themselves? In this episode, we dive into AgentEvolver, a groundbreaking framework from Alibaba's Tongyi Lab that flips the script on how we train autonomous AI agents.

    Traditional agent training is brutal: you need manually crafted datasets, expensive random exploration, and mountains of compute. AgentEvolver introduces a self-evolving system with three elegant mechanisms that let the LLM drive its own learning:

    • Self-Questioning – The agent explores environments and generates its own tasks through curiosity-driven interaction, eliminating the need for hand-crafted training data.

    • Self-Navigating – Instead of exploring at random, the agent builds an experience pool, retrieves relevant past solutions, and uses hybrid rollouts that mix experience-guided and vanilla trajectories. The authors address the resulting off-policy learning problem with selective boosting for high-performing trajectories.

    • Self-Attributing – The agent performs fine-grained credit assignment that goes beyond simple trajectory-level rewards, using step-level attribution to determine which specific actions and states actually contributed to success.

    We break down the mechanics of the advantage calculation, discuss how the authors handle the inference/learning sample mismatch through experience stripping, and explore why broadcasting trajectory-level advantages to the token level might be leaving performance on the table.
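For listeners who want a concrete picture of "broadcasting trajectory advantages to the token level", here is a minimal sketch of a GRPO-style group-relative advantage. It is a generic illustration, not AgentEvolver's code: the function names, the rewards, and the token counts are all made up. Each trajectory gets one normalized advantage, and every token in that trajectory receives the same value, which is the coarse credit assignment that step-level attribution aims to refine.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each trajectory reward against its sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


def broadcast_to_tokens(advantage, num_tokens):
    """Give every token in a trajectory the same trajectory-level advantage."""
    return [advantage] * num_tokens


rewards = [1.0, 0.0, 0.0, 1.0]     # hypothetical outcomes for 4 rollouts
token_counts = [3, 5, 2, 4]        # hypothetical trajectory lengths

advantages = group_relative_advantages(rewards)
token_advantages = [broadcast_to_tokens(a, n)
                    for a, n in zip(advantages, token_counts)]

for adv, toks in zip(advantages, token_advantages):
    print(round(adv, 3), len(toks))
```

Because every token in a successful rollout gets the same positive value, helpful and unhelpful steps within that rollout are rewarded identically.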

    The results are compelling: their 7B model outperforms much larger baselines on AppWorld and BFCL-v3 benchmarks while reducing training steps by up to 67%. This isn't just another incremental improvement – it's a fundamental shift from human-engineered training pipelines to LLM-guided self-improvement.

    Key topics: reinforcement learning for LLMs, experience replay, credit assignment, autonomous task generation, agent systems, GRPO/PPO optimization

    42 min