
Latest Artificial Intelligence R&D Session - with Digitalent & Mike Nedelko - Episode (006)
About this content
The session's topics include:
Reasoning Models: Mike highlighted the rise of reasoning models dominating leaderboards, enabled by "inference-time compute scaling." This lets a model spend more computation on harder problems at inference time, which improves accuracy. These models build on "chain-of-thought prompting," which strengthens reasoning by generating intermediate steps and is inspired by Daniel Kahneman's "System 2 thinking." He also discussed "Humanity's Last Exam," a challenging new benchmark designed to test advanced reasoning models.
DeepSeek R1: Mike explored DeepSeek R1's innovations, including stable 8-bit floating point operations and multi-head latent attention, which reduced memory usage and improved efficiency. The real breakthrough was its use of reinforcement learning with self-verifiable tasks, allowing the model to learn without traditional supervised data. This approach improved reasoning and generalisation.
Reinforcement Learning and Generalisation: Mike emphasised a shift from supervised fine-tuning to reinforcement learning, enabling models to generalise intelligence rather than just memorise. This approach lowers training costs while enhancing reasoning abilities. He also discussed the growing trend of using reinforcement learning and self-play to make AI training more efficient and affordable.
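To make the idea of "self-verifiable tasks" more concrete, here is a minimal Python sketch of a reward that can be checked programmatically, with no human-labelled answers. It is a toy illustration under stated assumptions, not DeepSeek's actual reward code: the addition task, the sample completions, and the helper names `extract_final_answer` and `verifiable_reward` are all hypothetical.

```python
import re


def extract_final_answer(completion: str):
    """Return the last integer that appears in the completion, or None."""
    matches = re.findall(r"-?\d+", completion)
    return int(matches[-1]) if matches else None


def verifiable_reward(a: int, b: int, completion: str) -> float:
    """Hypothetical reward: 1.0 if the final number equals a + b, else 0.0.

    The task checks itself, so no supervised target text is needed.
    """
    return 1.0 if extract_final_answer(completion) == a + b else 0.0


# Hypothetical model completions for the prompt "What is 17 + 25?"
samples = [
    "17 + 25: 17 + 20 = 37, then 37 + 5 = 42",
    "17 + 25 is about 40",
    "I am not sure",
]
rewards = [verifiable_reward(17, 25, s) for s in samples]
print(rewards)
```

In a reinforcement learning loop, scores like these would be used to reinforce the completions that reach a correct answer, whatever intermediate steps they took to get there.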