
The LLM Edge

Featured Post

Optimization of LLMs

πŸ—οΈπŸš’πŸš€ PydanticAI

Hey, AIM community! Next Wednesday, we begin a new monthly series on Optimization of LLMs! We'll tackle an important topic from first principles: building and optimizing LLMs before they make it to production. What are the essential concepts and code that underlie the technology, from loss functions and gradient descent to LSTMs, RLHF, and GRPO? Join us to kick off the series! Last week, we put PydanticAI to the test! πŸš€ The team behind...
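For a taste of those first principles, here's a minimal sketch of gradient descent on a mean-squared-error loss. The toy one-parameter model and the data are ours, purely for illustration:

```python
# Minimal sketch: gradient descent on a mean-squared-error loss for
# the toy model y_hat = w * x. Illustrative only, not event code.
def mse_loss(w, xs, ys):
    """Mean squared error of the predictions w * x against targets y."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grad(w, xs, ys):
    """Analytic gradient of the MSE loss with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]  # data generated by y = 2x
w, lr = 0.0, 0.05                          # initial weight, learning rate
for step in range(100):
    w -= lr * grad(w, xs, ys)              # step against the gradient
print(round(w, 3))                         # converges toward 2.0
```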

Cursor: An AI Engineer’s Guide to Vibe Coding and Beyond

Hey, AIM community! Next Wednesday, we cover a new agent orchestration framework: PydanticAI. The team "built PydanticAI to bring that FastAPI feeling to Gen AI app development" because everything else out there wasn't good enough. Join Dr. Greg and The Wiz to assess whether they're accomplishing that mission as we learn to build, ship, and share a multi-agent application. Last week, we explored Cursor: An AI Engineer’s Guide to Vibe Coding and Beyond. In 2025, top engineers in...
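As a preview, here's roughly what a minimal PydanticAI agent looks like, based on the library's docs around this time; the model string and parameter names are assumptions that may differ in current releases:

```python
# Minimal PydanticAI sketch (API as documented in early 2025; treat
# the model string and parameter names as assumptions that may have
# changed in later releases). Requires an OPENAI_API_KEY to run.
from pydantic import BaseModel
from pydantic_ai import Agent

class CityInfo(BaseModel):              # typed output, FastAPI-style
    city: str
    country: str

agent = Agent(
    "openai:gpt-4o",                    # model identifier is an assumption
    result_type=CityInfo,               # response is validated into CityInfo
    system_prompt="Answer with the city and country only.",
)

result = agent.run_sync("Where were the 2012 Summer Olympics held?")
print(result.data)  # CityInfo(city='London', country='United Kingdom')
```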

Reasoning in Continuous Latent Space: COCONUT & Recurrent Depth Approaches

Hey, AIM community! Next Wednesday, join us as we dive into Cursor: An AI Engineer’s Guide! We'll show you how to set up a dev environment that aligns with what some of the best AI Engineers use in their daily workflows to build, ship, and share LLM applications. We'll even introduce you to the future: 🎸 Vibe Coding, a.k.a. coding in mostly natural language! Last week, we explored Reasoning in Continuous Latent Space, including COCONUT and a deeper, even more recent, "Recurrent Depth"...

DeepSeek-R1 & Training Your Own Reasoning Model

Hey, AIM community! Next Wednesday, join us as we look into COCONUT: Chain of Continuous Thought. Following up on our recent LRMs event on DeepSeek-R1, we’ll continue exploring Chains of Thought (CoTs), but this time in latent space! We might even go beyond COCONUT to talk a bit about latent recurrence as well. Last week, we explored DeepSeek-R1! We covered a brief history of models from DeepSeek, then tied together important ideas ranging from CoT, to test-time compute, to process and...
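The core COCONUT trick, as we understand it: rather than decoding a token and re-embedding it at each reasoning step, the model's last hidden state is fed straight back in as the next input embedding. Here's a rough, runnable illustration of that loop using GPT-2 via Hugging Face transformers; it's our sketch of the idea, not the paper's code:

```python
# Rough illustration of continuous "thought" steps a la COCONUT:
# feed the model's last hidden state back as the next input embedding
# instead of decoding a token. Our sketch, not the paper's code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt = "2 + 3 * 4 ="
embeds = model.get_input_embeddings()(tok(prompt, return_tensors="pt").input_ids)

with torch.no_grad():
    for _ in range(4):                   # four latent steps, no tokens emitted
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        thought = out.hidden_states[-1][:, -1:, :]    # (1, 1, hidden_dim)
        embeds = torch.cat([embeds, thought], dim=1)  # hidden state -> next input
    final_logits = model(inputs_embeds=embeds).logits[:, -1, :]

print(tok.decode(final_logits.argmax(-1)))  # decode only the final answer token
```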

smolagents and Open-source DeepResearch

Hey, AIM community! Next Wednesday, join us to tackle DeepSeek-R1! We'll also train our own reasoning model with Unsloth while we're at it! Additionally, we'll dig into what we know about how the model was trained, and how it was used to distill Qwen and Llama models. Last week, we explored Hugging Face's smolagents library, and we went deep on their new open-source DeepResearch implementation. What makes the library "smol"? How is it better than competitors? How did the HF team recreate...
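One key ingredient in R1-style training is GRPO, which replaces a learned value model with group-relative scoring: sample several answers per prompt, then normalize each answer's reward against its group's mean and standard deviation. A back-of-the-envelope sketch, with made-up rewards:

```python
# Sketch of GRPO's group-relative advantage: sample several answers to
# one prompt, then normalize each answer's reward against the group's
# mean and standard deviation. The 0/1 rewards below are made up.
from statistics import mean, stdev

group_rewards = [1.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0]  # 1.0 = verifiably correct

mu, sigma = mean(group_rewards), stdev(group_rewards)
advantages = [(r - mu) / (sigma + 1e-8) for r in group_rewards]

print([round(a, 2) for a in advantages])
# Correct answers get positive advantage, incorrect ones negative --
# no separate critic/value network required.
```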

Multimodal Vision Language Models (VLMs) and Complex Document RAG with Llama 3.2

Hey, AIM community! Next Wednesday, join us in learning about smolagents and how you can use the new framework to make big-impact agent applications with a small number of lines of code! Last week, we explored Multimodality with Llama 3.2, Meta’s first multimodal Llama model! We talked about the genesis of Vision Language Models (VLMs), and we even combined two VLMs, using one for complex document parsing and Llama 3.2 itself for document understanding! Watch the entire event for a primer on...
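If you want a head start, here's what a minimal smolagents agent looks like, based on the docs at the time; class names and defaults are assumptions that have shifted across releases:

```python
# Minimal smolagents sketch (names from the docs as of early 2025 --
# treat them as assumptions, since classes have been renamed across
# releases). Requires a Hugging Face token for the hosted model.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # the agent writes Python that calls this tool
    model=HfApiModel(),              # defaults to a hosted Hugging Face model
)

agent.run("How many seconds are in a leap year?")
```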

Agent Evaluation with Langchain

Hey, AIM community! Next Wednesday, join us in learning about Multimodality with Llama 3.2. Llama 3.2 from Meta adds vision to our LLM application stack. What does this mean for AI Engineers and leaders? We have questions: How does multimodality actually work? What are its limits today, and what do we expect in the coming year? When should we leverage multimodal models as we build, ship, and share? Is Llama 3.2 ready for production, and if so, for which use cases? Join us live to find out! Last...

Large Reasoning Models

Hey, AIM community! Join Dr. Greg and The Wiz as they cover Agent Evaluation next Wednesday, January 22! Have you seen the new agent evaluation metrics, like topic adherence, tool call accuracy, and agent goal accuracy? They seem like something we should all know about in 2025! Join us live next week to break down when we should use them and how! Last week, we dove into Large Reasoning Models (LRMs) like OpenAI’s o1, which are designed to "think through step-by-step" before they answer. We had a...
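To build some intuition before the event, here's a simplified stand-in for a tool-call-accuracy-style metric; this is our illustration, not the actual RAGAS implementation:

```python
# Simplified stand-in for a "tool call accuracy"-style metric: the
# fraction of reference (tool, args) calls the agent reproduced in
# order. Our illustration, not the actual RAGAS implementation.
def tool_call_accuracy(agent_calls, reference_calls):
    matches = sum(1 for got, want in zip(agent_calls, reference_calls) if got == want)
    return matches / len(reference_calls) if reference_calls else 1.0

agent_trace = [("search", "weather in Paris"), ("calculator", "3*7")]
reference   = [("search", "weather in Paris"), ("calculator", "3*8")]
print(tool_call_accuracy(agent_trace, reference))  # 0.5
```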

RAG Evaluation with RAGAS

Hey, AIM community! Join Dr. Greg and The Wiz as they cover Large Reasoning Models next Wednesday, January 15. With the release of o1, o3, and the new Gemini, everyone is talking about Chain-of-Thought Reasoning and Test-Time Compute. What are these things, anyway? And what are the implications for building production LLM applications with models like this in 2025 and beyond? Join the discussion live on Wednesday at 10 AM PT! Last week, we dove into the latest RAG Evaluation metrics and RAGAS...
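One simple flavor of test-time compute is self-consistency: sample several chains of thought and majority-vote their final answers. A toy sketch, with a hypothetical sample_answer() standing in for a real LLM call:

```python
# Toy sketch of self-consistency, one simple form of test-time compute:
# spend more inference-time compute by sampling n reasoning chains and
# majority-voting their final answers. sample_answer() is hypothetical.
import random
from collections import Counter

def sample_answer(question: str) -> str:
    """Stand-in for one sampled chain of thought's final answer."""
    return random.choice(["42", "42", "41"])  # a noisy reasoner, for demo only

def self_consistency(question: str, n: int = 16) -> str:
    votes = Counter(sample_answer(question) for _ in range(n))
    return votes.most_common(1)[0][0]         # most frequent answer wins

print(self_consistency("What is 6 * 7?"))     # usually "42"
```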

What will you learn next in 2025?

Hey, AIM community, and Happy New Year! πŸ₯³ We'll kick off with our first live YouTube event featuring Dr. Greg and The Wiz, who will show you what to evaluate (and how!) when building, shipping, and sharing production RAG applications in 2025! πŸ”­ Coming Up: RAG Evaluation with RAGAS. If you're unfamiliar with assessing RAG applications, 2025 is the time to learn best-practice metrics! For those already in the know, it's important to keep up to date with the latest evolution, from Noise Sensitivity to...
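For those getting started, a minimal RAGAS evaluation loop looks roughly like this; the imports and column names are from the ragas 0.1-era docs and may differ in newer releases:

```python
# Minimal RAGAS sketch (ragas 0.1-era API; imports and column names
# may differ in newer releases). Needs an LLM judge configured --
# an OPENAI_API_KEY by default.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

data = Dataset.from_dict({
    "question": ["Who created Pydantic?"],
    "answer":   ["Pydantic was created by Samuel Colvin."],
    "contexts": [["Samuel Colvin is the creator of Pydantic."]],
})

print(evaluate(data, metrics=[faithfulness, answer_relevancy]))
```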