πŸ—πŸš’πŸš€ DeepSeek-R1



Hey, AIM community!


Next Wednesday, join us as we look into COCONUT: Chain of Continuous Thought. Following up on our recent LRMs event on DeepSeek-R1, we’ll continue exploring Chains of Thought (CoTs), but this time in latent space! We might even go beyond COCONUT to talk a bit about latent recurrence as well.



Last week, we explored DeepSeek-R1! We covered a brief history of models from DeepSeek, then tied together important ideas ranging from CoT to test-time compute, process and outcome reward modeling, RL versus RLHF, and more!

We learned that DeepSeek-R1 and DeepSeekMath use RL for real, in a way that lets the LLM "play" during training to discover its own CoTs, graded on verifiable outcomes. That makes the approach transfer directly to domains beyond math and code, in contrast to typical RLHF approaches that optimize toward specific human feedback.
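To make the "play" idea concrete, here is a minimal sketch of the group-relative scoring step at the heart of GRPO-style RL (the method behind DeepSeekMath and DeepSeek-R1): sample several CoTs for one prompt, grade each on a verifiable outcome, and normalize rewards within the group. The function names and the toy string-matching reward below are our own illustrative assumptions, not any real library's API.

```python
# Hedged sketch of GRPO-style group-relative advantages.
# All names here are hypothetical, for illustration only.
from statistics import mean, stdev

def outcome_reward(completion: str, gold_answer: str) -> float:
    """Toy verifiable outcome reward: 1.0 if the completion ends
    with the gold answer, else 0.0. No learned reward model needed."""
    return 1.0 if completion.strip().endswith(gold_answer) else 0.0

def group_relative_advantages(completions, gold_answer):
    """Grade a group of sampled CoTs, then normalize rewards within
    the group, so each sample is scored relative to its siblings
    rather than by a separate value model."""
    rewards = [outcome_reward(c, gold_answer) for c in completions]
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    if sigma == 0.0:
        # All samples tied: no signal, every advantage is zero.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]

# Toy usage: four sampled CoTs for "2 + 2 = ?", graded only on the outcome.
group = [
    "Let me think... 2 + 2 = 4",
    "The answer is 5",
    "Adding gives 4",
    "I am not sure",
]
print(group_relative_advantages(group, "4"))
```

Correct samples get positive advantages and incorrect ones negative, so the policy update pushes the model toward whichever CoTs happened to reach the right answer, with no per-step human labels.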

PS ... shout-out to the team at Unsloth for their "Train your own reasoning model" blog!

🧰 Resources


🔭 Coming Up!

Cursor: An AI Engineer’s Guide

Join us live for a one-hour dive into setting up the proper 2025 development environment that the best AI Engineers use, then build, ship, and share your very first LLM application with these new tools! If you're not sure where to start with AIE, start here.

PydanticAI: From Data Validation to Agents

Question: "Just because my team is good at using Pydantic, should I consider PydanticAI for agents?" It is built by the same team, after all. We've been asked this enough by our community and in our course - let's find out together, live!


🌐 Around the Community!

💡 Transformation Spotlight: Allan Tan! A serial technopreneur and founder of Predictive Systems, Inc., Allan shares why he still loves to code and what he's working on at The LLM Edge.


🤓 See what the community is building, shipping, and sharing this week. Join us in the Lounge every Monday at 9 AM PT for some accountability!


Want to join the AIM community? Hop into Discord and share your intro!



🖼️ Meme of the Week


🌟 Want to start building, shipping, and sharing, but not sure how? Check out our LLM Foundations - a 5-day email-based course.


Keep building πŸ—οΈ shipping 🚒 and sharing πŸš€,


Dr. Greg, The Wiz, Seraacha, and Lusk
AI Makerspace

Unsubscribe · Preferences
