πŸ—πŸš’πŸš€ Reasoning and Test-Time Compute



Hey, AIM community!


Join Dr. Greg and The Wiz as they cover Agent Evaluation next Wednesday, January 22!

Have you seen these new agent evaluation metrics like topic adherence, tool call accuracy, and agent goal accuracy? They seem like something we should all know about in 2025! Join us live next week to break down when we should use them and how!



Last week, we dove into Large Reasoning Models (LRMs), like OpenAI's o1, which are designed to "think step-by-step" before they answer. We had a wide-ranging discussion, from Chain-of-Thought basics to what's been happening in the space of reasoning (including the language space and the latent space) since last month! We had a blast, and we look forward to digging deeper into process reward modeling and much more reasoning soon!
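As a toy illustration of the test-time compute idea, here is a minimal sketch (not from the session, with a stubbed "model") of self-consistency: spend more compute at inference time by sampling several reasoning chains and taking a majority vote over their final answers.

```python
import random
from collections import Counter

def sample_chain(question, rng):
    """Stub for one sampled chain-of-thought. A real LRM would generate
    reasoning text and a final answer; here we simulate a noisy solver
    that usually lands on the right answer for 17 + 25."""
    return 42 if rng.random() < 0.8 else rng.choice([41, 43, 52])

def self_consistency(question, n_samples=25, seed=0):
    """Sample many chains, then majority-vote over their final answers."""
    rng = random.Random(seed)
    answers = [sample_chain(question, rng) for _ in range(n_samples)]
    return Counter(answers).most_common(1)[0][0]

print(self_consistency("What is 17 + 25?"))
```

Even though any single sampled chain can be wrong, the vote across many chains is much more reliable; this is the simplest form of trading extra test-time compute for accuracy.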

🧰 Resources


πŸ”­ Coming Up!

Multimodality with Llama 3.2

Llama 3.2 from Meta adds vision to our tool stack in a way that finally feels compelling and useful. What are the limits of multimodal models today? How do they actually work? Are they ready for production use cases on complex data that includes text and figures? Let's find out!

smolagents: Small Agents?


Explore Hugging Face's smolagents, a sleek new library for building agents across multiple levels of agency. Join us live to dive into its features, compare it to leading frameworks like LangChain, LlamaIndex, AG2, and others, and build, ship, and share with smolagents!


🌐 Around the Community!

πŸ’‘ Transformation Spotlight: Debora Andrade. Learn how she went from a physics-trained postdoctoral researcher to a Generative AI consultant and entrepreneur. Upskilling in software engineering and coding was critical to her journey!


πŸ€“ See what the community is building, shipping, and sharing this week. Join us in the Lounge every Monday at 9 AM PT for some accountability!


Want to join the AIM community? Hop into Discord and share your intro!



πŸ–ΌοΈ Meme of the Week


🌟 Want to start building, shipping, and sharing but not sure how? Check out our LLM Foundations - a 5-day email-based course to start learning today.


Keep building πŸ—οΈ shipping 🚒 and sharing πŸš€,


Dr. Greg, The Wiz, Seraacha, and Lusk
AI Makerspace

