On Wednesday, we'll cover the infra stack that we recommend for RAG in 2025. Then, we'll build, ship, and share a best-practice RAG app.
We'll also discuss the key production tradeoffs and implications to consider before and after deployment when going from zero to production with RAG!
Last week, we discussed the latest open-source repo drops from DeepSeek Week and covered how they're being used as a new best-practice way to run inference on MoE models on Hopper GPUs (e.g., H100, H200)! vLLM has already implemented much of the inference system, as we saw during the event! Learn how cross-node Expert Parallelism drove DeepSeek's solution to the problem they set out to solve: higher throughput and lower latency, which means lower cost and higher theoretical revenue.
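To make the Expert Parallelism idea concrete, here is a minimal sketch of top-k expert routing in a Mixture-of-Experts layer. This is illustrative only: real systems like DeepSeek's use a learned gating network and cross-node all-to-all communication, while this toy uses random scores as a stand-in and just measures the per-expert load that Expert Parallelism tries to keep balanced.

```python
import random

def route_tokens(num_tokens, num_experts, top_k=2, seed=0):
    """Assign each token to its top_k highest-scoring experts.
    Random scores stand in for a learned gating network."""
    rng = random.Random(seed)
    assignments = []
    for _ in range(num_tokens):
        scores = [rng.random() for _ in range(num_experts)]
        top = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)[:top_k]
        assignments.append(top)
    return assignments

def expert_load(assignments, num_experts):
    """Count how many tokens each expert receives. In a cross-node
    deployment, each expert lives on some GPU, so uneven load here
    means idle GPUs there -- the balancing problem EP addresses."""
    load = [0] * num_experts
    for top in assignments:
        for e in top:
            load[e] += 1
    return load

assignments = route_tokens(num_tokens=8, num_experts=4, top_k=2)
load = expert_load(assignments, num_experts=4)
```

Each token activates only `top_k` of the experts, which is why MoE models can grow total parameter count without growing per-token compute, and why spreading experts across nodes raises throughput.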
This is why DeepSeek's inference system has been called "The North Star for LLM Inference" by vLLM.
Coming Up!
Enterprise Agents with OpenAI
What does the agents SDK look like from OpenAI? How does it build on previous work they've done? Are they officially in the end-to-end platform game competing with orchestration frameworks like LangChain, LlamaIndex, CrewAI, and others? Join us live to find out!
The Model Context Protocol (MCP)
What is MCP, exactly? What can we learn by looking at the protocol as defined, and by tracing its impact in the five months since its release? Released by Anthropic in November 2024, MCP has caught on as a new gold standard for the way context is shared between models and tools. Let's learn it, then build, ship, and share with it!
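For a taste of what the protocol looks like on the wire: MCP is built on JSON-RPC 2.0, and `tools/list` is one of its standard methods. The server stub below is a simplified sketch, not a real MCP implementation, and the `search_docs` tool it advertises is a hypothetical example.

```python
import json

def make_request(request_id, method, params=None):
    """Build a JSON-RPC 2.0 request string, the framing MCP uses."""
    msg = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

def handle_request(raw):
    """Toy server: answer a tools/list request with one
    hypothetical tool (real MCP servers implement the full spec)."""
    req = json.loads(raw)
    if req["method"] == "tools/list":
        result = {"tools": [{"name": "search_docs",  # hypothetical tool
                             "description": "Search project docs"}]}
    else:
        result = {}
    return json.dumps({"jsonrpc": "2.0", "id": req["id"], "result": result})

reply = json.loads(handle_request(make_request(1, "tools/list")))
```

The point of the shared framing is that any MCP client can ask any MCP server what tools and context it offers, without a bespoke integration per pair.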
💡 Transformation Spotlight: Tyler Laughlin, an Information Systems Engineer at Adobe, has some great advice for enterprise-level engineers on how they should be using Gen AI. Read more about his story!