
Interviewing Ross Taylor on the state of AI: Chinese open models, scaling reasoning, useful tools, and what comes next

I’m excited to welcome Ross Taylor back on the podcast (and sorry for the lack of episodes in general – I have a lot going on!). The first time Ross came on, we focused on reasoning (before inference-time scaling and that sort of RL was popular), agents, Galactica, and more from his Llama days. Since then, and especially after DeepSeek R1, Ross and I have talked asynchronously about the happenings of AI, so it’s exciting to do it face to face.

In this episode we cover some of everything:

* Recent AI news (Chinese models and OpenAI’s coming releases)
* “Do and don’t” of LLM training organizations
* Reasoning research and academic blind spots
* Research people aren’t paying enough attention to
* Non-language-modeling news & other topics

Listen on Apple Podcasts, Spotify, YouTube, and wherever you get your podcasts. For other Interconnects interviews, go here.

Show outline, as a mix of questions and edited assertions that Ross sent me as potential topics.

00:00 Recent AI news

Related reading is on Kimi’s K2 model and thoughts on OpenAI’s forthcoming open release.

* What did you think of Z.ai’s GLM 4.5 model (including the MIT-licensed base model) with very strong scores? And Kimi?
* What will OpenAI’s open model actually be?
* What do you make of the state of the ecosystem?

12:10 “Do and don’t” of LLM training organizations

Related reading is on managing training organizations or the Llama 4 release.

This is one of my favorite topics – I think a lot of great stuff will be written on it in the future. For now, Ross asserts…

* Most major LLM efforts are not talent-bound but politics-bound. Recent failures like Llama 4 are org failures, not talent failures.
* Most labs are chaotic, changing direction every week – a very different picture from the narrative presented online.
* Most labs resemble investment banks or accountancy firms in that they hire smart young people as “soldiers” and deliberately burn them out with extremely long hours.

36:40 Reasoning research and academic blind spots

Related reading is two papers raising questions about the Qwen base models for RL (or a summary blog post I wrote).

I start with: What do you think of o3, and search as something to train with RL?

And Ross asserts…

* Most open reasoning research since R1 has been unhelpful, because there isn’t enough compute to see what matters (the underlying model and iterations).
* The best stuff has been simple tweaks to GRPO, like overlong filtering and removing the KL divergence (a minimal sketch of both tweaks follows the outline below).
* There is far too much focus on MATH and code – AIME has only tens of samples, so it is very noisy.
* People are generally building the wrong kind of environments – puzzles, games, etc. – instead of thinking about what kind of new capabilities they’d like to incentivise emerging.

50:20 Research people aren’t paying enough attention to

The research area I hear the most about right now is “rubrics” – a per-prompt, specialized LLM-as-a-judge to replace reward models. SemiAnalysis reported OpenAI scaling this approach, and lots of great research is coming out around it (a hypothetical sketch of rubric scoring also follows the outline below).

I start with: What do you think of the state of RL scaling and generalization? What of models losing…

Ross asserts…

* Rubrics are underhyped on social media – they were the driving force behind projects like DeepResearch – and GenRMs are interesting but perhaps slightly overhyped.
* There is an evals crisis: there are not enough high-quality evals, particularly for frontier tasks like automating research and real-life work. This is an impediment to anyone building agents or ASI.

01:02:46 Extra stuff!

I ask Ross: What AI are you using today? Why?
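For readers who want the two GRPO tweaks written down concretely, here is a minimal NumPy sketch (my own illustration, not code from the episode or any particular paper) of group-relative advantages with overlong responses masked out of the loss. The function name and the assumption of one scalar reward per sampled response are illustrative.

```python
import numpy as np

def grpo_advantages(rewards, truncated, eps=1e-6):
    """Group-relative advantages for the G responses sampled for one prompt.

    rewards:   length-G array of scalar rewards, one per sampled response.
    truncated: length-G booleans, True if a response hit the max generation
               length. Overlong filtering masks these out of the loss rather
               than letting a truncation penalty pollute the gradient.
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    keep = ~np.asarray(truncated, dtype=bool)   # overlong filtering
    adv = np.zeros_like(rewards)
    kept = rewards[keep]
    if kept.size > 1:
        # GRPO's core trick: standardize rewards within the group instead
        # of learning a value model (as PPO does).
        adv[keep] = (kept - kept.mean()) / (kept.std() + eps)
    return adv, keep  # `keep` also zeroes masked responses out of the loss

# Example: four samples, one truncated. The "remove KL" tweak just means
# the final objective is the clipped policy-gradient term with these
# advantages and no beta * KL(pi || pi_ref) penalty added on top.
adv, keep = grpo_advantages([1.0, 0.0, 1.0, 0.0],
                            truncated=[False, False, True, False])
```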
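And here is a deliberately hypothetical sketch of the rubric idea: a per-prompt checklist scored by an LLM judge, with the pass rate used as the reward. The `call_llm` helper is an assumption standing in for whatever completion API you use; this illustrates the concept, not anyone’s actual pipeline.

```python
def rubric_score(prompt: str, response: str, rubric: list[str], call_llm) -> float:
    """Score a response against prompt-specific criteria with an LLM judge.

    Returns the fraction of rubric items the judge marks as satisfied,
    usable directly as a scalar reward in place of a learned reward model.
    `call_llm` is an assumed stand-in for any text-in, text-out LLM API.
    """
    satisfied = 0
    for criterion in rubric:
        verdict = call_llm(
            f"Task: {prompt}\n\nResponse: {response}\n\n"
            f"Criterion: {criterion}\n"
            "Does the response satisfy this criterion? Answer YES or NO."
        )
        satisfied += verdict.strip().upper().startswith("YES")
    return satisfied / len(rubric)
```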
To conclude, Ross wanted to discuss how AlphaEvolve has been underhyped on social media, and how it means the future isn’t just RL – it shows there are other effective ways to use inference compute.

Interconnects is a reader-supported publication. Consider becoming a subscriber.

Transcript

Created with AI – pardon the minor typos. Not quite enough time this week, but I’m hiring someone to help with this soon!

Nathan Lambert: Hey, Ross. How's it going? Welcome back to Interconnects. I took a many-month break from podcasting. I've been too busy to do all this stuff myself.

Ross Taylor: Yeah, I was trying to think of all the things that have happened since the last time we did a podcast a year ago. In AI time, that's like two hundred years.

Nathan Lambert: Yeah, so I was looking at it. We talked about reasoning, and o1 hadn’t happened yet.

For a brief intro, Ross was a co-founder of Papers with Code, and that brought him to Meta. At Meta, he was a lead on Galactica, which was a kind of language model ahead of its time relative to ChatGPT. If people don't know about Galactica, there's a great paper worth reading. Then he was doing a bunch of stuff on reasoning with Llama, related to a lot of the techniques that we'll talk about in this episode.

And now he's doing a startup. I don't know if he wants to talk about this, but generally, we talk a lot about various things. This got started through o1 and trying to ...