Latest Artificial Intelligence R&D Session - with Digitalent & Mike Nedelko - Episode (006)

About this content

The session's topics include:

Reasoning Models: Mike highlighted the rise of reasoning models dominating leaderboards, enabled by "inference time compute scaling." This allows models to allocate more computational power dynamically, leading to better accuracy and efficiency. These models use "chain of thought prompting," which enhances reasoning by generating intermediate steps, an approach inspired by Daniel Kahneman's "System 2 thinking." He also discussed "Humanity's Last Exam," a challenging new benchmark designed to test advanced reasoning models.

DeepSeek R1: Mike explored DeepSeek R1's innovations, including stable 8-bit floating point operations and multi-head latent attention, which reduced memory usage and improved efficiency. The real breakthrough was its use of reinforcement learning with self-verifiable tasks, allowing the model to learn without traditional supervised data. This approach improved reasoning and generalisation.

Reinforcement Learning and Generalisation: Mike emphasised a shift from supervised fine-tuning to reinforcement learning, enabling models to generalise intelligence rather than just memorise. This approach lowers training costs while enhancing reasoning abilities. He also discussed the growing trend of using reinforcement learning and self-play to make AI training more efficient and affordable.
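
For listeners who want a concrete picture of the "self-verifiable tasks" mentioned above, here is a minimal sketch in Python. It is not taken from the episode or from DeepSeek's code. It assumes a task whose correct answer is known in advance (for example, an arithmetic problem) and a hypothetical convention where the model wraps its final answer in `<answer>` tags. The function names are illustrative only. The point is that the reward can be computed by a simple program, with no human-written reasoning examples.

```python
import re
from typing import Optional


def extract_final_answer(completion: str) -> Optional[str]:
    """Return the text inside the last <answer>...</answer> pair, or None.

    The <answer> tag convention is an assumption made for this sketch.
    """
    matches = re.findall(r"<answer>(.*?)</answer>", completion, flags=re.DOTALL)
    return matches[-1].strip() if matches else None


def verifiable_reward(completion: str, expected: str) -> float:
    """Score a model completion against a known-correct answer.

    Returns 1.0 when the extracted answer matches exactly, else 0.0.
    The intermediate reasoning text is not graded, only the final answer.
    """
    answer = extract_final_answer(completion)
    if answer is None:
        return 0.0  # no parsable answer, so no reward
    return 1.0 if answer == expected.strip() else 0.0


if __name__ == "__main__":
    sample = "6 times 7 is 42. <answer>42</answer>"
    print(verifiable_reward(sample, "42"))
```

In a reinforcement learning loop, a score like this would serve as the training signal in place of labelled step-by-step solutions. Real systems typically combine it with other checks, such as formatting rewards or unit tests for code.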