๐Ÿ—๐Ÿšข๐Ÿš€ Agent Evaluation



Hey, AIM community!


Next Wednesday, join us to learn about Multimodality with Llama 3.2. Llama 3.2 from Meta adds vision to our LLM application stack. What does this mean for AI Engineers and leaders?

We have questions:

  • How does multimodality actually work?
  • What are its limits today and what do we expect in the coming year?
  • When should we leverage multimodal models when building, shipping, and sharing?
  • Is Llama 3.2 ready for production? If so, for which use cases?


Join us live to find out!



Last week, we dove into Agent Evaluation, uncovering best practices for assessing agent workflows with metrics like Topic Adherence, Tool Call Accuracy, and Agent Goal Accuracy. 📊

โš ๏ธ Spoiler alert! It's not ready for prime time yet, and RAGAS is still developing synthetic test set generation tools. However, understanding how you'll likely combine agent-specific (e.g., tool-calling) evaluation tools based on LLM tracing with standard LLM and RAG application evals.

That said, very simple agents can be evaluated. Check out what we know!
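To make the idea concrete, here's a bare-bones sketch of a Tool Call Accuracy check. This is our own illustration, not the RAGAS implementation or API: the function name, the dict shape for tool calls, and the example tools (`get_weather`, `convert`) are all hypothetical, standing in for tool calls you'd pull from your LLM traces.

```python
# Illustrative sketch only -- NOT the RAGAS API. A minimal "Tool Call
# Accuracy" metric: compare the tool calls an agent actually made
# (e.g., extracted from LLM traces) against the expected calls for a
# test case, respecting order.

def tool_call_accuracy(expected: list[dict], actual: list[dict]) -> float:
    """Fraction of expected tool calls that appear, in order, with
    matching names and arguments, among the agent's actual calls."""
    if not expected:
        return 1.0  # nothing was required, so trivially correct
    hits, i = 0, 0
    for call in expected:
        # Scan forward through the actual calls so ordering matters.
        while i < len(actual):
            made = actual[i]
            i += 1
            if made["name"] == call["name"] and made["args"] == call["args"]:
                hits += 1
                break
    return hits / len(expected)

# Hypothetical test case: the agent should look up weather, then convert units.
expected = [
    {"name": "get_weather", "args": {"city": "Paris"}},
    {"name": "convert", "args": {"to": "F"}},
]
actual = [
    {"name": "get_weather", "args": {"city": "Paris"}},
    {"name": "convert", "args": {"to": "C"}},  # wrong argument -> no credit
]
print(tool_call_accuracy(expected, actual))  # 0.5
```

The same compare-against-a-reference pattern underlies the other agent metrics: Agent Goal Accuracy checks the final outcome against the intended goal, and Topic Adherence checks each turn against a set of allowed topics.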

🧰 Resources


🔭 Coming Up!

smolagents: Small Agents?

Join us to build, ship, and share an agentic application or two that can make a big impact with a small number of lines of code! We'll talk about agency levels, code agents, and framework comparisons. See you there!

COCONUT: Chain of Continuous Thought

We continue our discussion of Large Reasoning Models with a deep dive into continuous chains of thought! The official repo was just released, so join us to learn about the tech and give it a test drive!


๐ŸŒ Around the Community!

💡 Transformation Spotlight: Xico Casillas! Follow his journey from conversational interface designer to leading his team's LLM and RAG app development. Read more!


🤓 See what the community is building, shipping, and sharing this week. Join us in the Lounge every Monday at 9 AM PT for some accountability!


Want to join the AIM community? Hop into Discord and share your intro!



๐Ÿ–ผ๏ธ Meme of the Week


🌟 Want to start building, shipping, and sharing but not sure how? Check out our LLM Foundations - a 5-day email-based course to start learning and building, shipping, and sharing today.


Keep building ๐Ÿ—๏ธ shipping ๐Ÿšข and sharing ๐Ÿš€,


Dr. Greg, The Wiz, Seraacha, and Lusk
AI Makerspace


The LLM Edge
