The AI Morning Read, January 27, 2026 - Heavy Thinking, Long Memory: Inside the 560B Model Teaching AI to Reason at Scale
Summary
In today's podcast we take a deep dive into LongCat-Flash-Thinking-2601, a massive 560-billion-parameter open-source Mixture-of-Experts model designed to push the boundaries of agentic reasoning and complex tool use. The model achieves state-of-the-art performance on difficult benchmarks such as BrowseComp and τ²-Bench through a unified training framework that trains domain-specialized experts in parallel and then fuses them into a single model. Its creators also employed "environment scaling" across more than 20 domains and deliberately injected real-world noise into the training process so the model stays robust in imperfect environments. To tackle the hardest problems, the model offers a "Heavy Thinking" mode that scales test-time computation by expanding both the depth and the width of its reasoning through parallel exploration. Finally, we explore the experimental "Zig-Zag Attention" design that lets the system efficiently handle ultra-long contexts of up to 1 million tokens, cementing its status as a leading tool for long-horizon agentic workflows.
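To make the "Heavy Thinking" idea more concrete, here is a minimal Python sketch of one common way to scale test-time computation through parallel exploration: sample several independent reasoning traces ("width"), give each trace a larger token budget ("depth"), and aggregate the candidate answers by majority vote. This is an illustrative assumption about the general technique, not the model's actual mechanism; the function `generate_trace`, the parameter names, and the voting scheme are all hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple

# Hypothetical sketch of "heavy thinking"-style test-time scaling:
# widen the search with several independent reasoning traces and
# deepen it with a larger per-trace token budget, then aggregate.

def heavy_think(
    generate_trace: Callable[[str, int, float], Tuple[str, str]],
    prompt: str,
    width: int = 8,             # number of parallel reasoning traces ("width")
    depth_tokens: int = 32768,  # per-trace token budget ("depth")
    temperature: float = 0.8,
) -> str:
    """Sample `width` traces and return the majority-vote answer."""
    answers: List[str] = []
    for _ in range(width):
        # Each call returns (reasoning_trace, final_answer); the reasoning
        # text is discarded here and only the final answer is voted on.
        _, answer = generate_trace(prompt, depth_tokens, temperature)
        answers.append(answer.strip())
    # Aggregate the parallel explorations with a simple majority vote.
    return Counter(answers).most_common(1)[0][0]

# Toy stand-in for a model call, so the sketch runs end to end.
def fake_generate_trace(prompt: str, max_tokens: int, temperature: float):
    answer = random.choice(["42", "42", "42", "41"])  # noisy, biased toward 42
    return (f"thinking about: {prompt}", answer)

if __name__ == "__main__":
    print(heavy_think(fake_generate_trace, "What is 6 * 7?", width=8))
```

In practice the traces would be sampled concurrently against the model's API rather than in a loop, and the aggregation step could be a learned or verifier-based selector instead of a plain vote; the sketch only conveys the depth-times-width structure described in the episode.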