Next Wednesday, join us in learning about Multimodality with Llama 3.2. Llama 3.2 from Meta adds vision to our LLM application stack. What does this mean for AI Engineers and leaders?
We have questions:
How does multimodality actually work? (See the sketch after these questions.)
What are its limits today and what do we expect in the coming year?
When should we leverage multimodal models while building, shipping, and sharing?
Is Llama 3.2 ready for production? If so, what use cases?
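To make the first question concrete before the session, here's a minimal sketch of multimodal inference with Llama 3.2 Vision via Hugging Face transformers. It assumes you have access to the gated meta-llama checkpoint, and the image URL is a placeholder:

```python
# Minimal sketch: image + text in, text out with Llama 3.2 Vision.
# Assumes access to the gated meta-llama checkpoint on Hugging Face.
import torch
import requests
from PIL import Image
from transformers import AutoProcessor, MllamaForConditionalGeneration

model_id = "meta-llama/Llama-3.2-11B-Vision-Instruct"
model = MllamaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# Placeholder image URL -- swap in your own.
image = Image.open(requests.get("https://example.com/chart.png", stream=True).raw)

# The processor interleaves the image with the chat-formatted prompt.
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "What does this chart show?"},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(image, prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```

The takeaway: images become tokens in the same context window as text, so the rest of your LLM application stack stays largely unchanged.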
Last week, we dove into Agent Evaluation, uncovering best practices for assessing agent workflows with metrics like Topic Adherence, Tool Call Accuracy, and Agent Goal Accuracy.
⚠️ Spoiler alert! It's not ready for prime time yet, and RAGAS is still developing its synthetic test set generation tools. However, it's worth understanding now how you'll likely combine agent-specific evaluation tools (e.g., for tool calling) built on LLM tracing with standard LLM and RAG application evals.
That said, very simple agents can be evaluated. Check out what we know!
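For a taste of what "very simple" looks like, here's a minimal sketch of scoring a short agent trace with RAGAS's Tool Call Accuracy metric (API shown as of ragas v0.2; the get_weather tool and the conversation are hypothetical):

```python
# Minimal sketch: scoring a simple agent trace with RAGAS's
# Tool Call Accuracy metric (ragas v0.2 agentic evals).
# The get_weather tool and the conversation below are hypothetical.
import asyncio
from ragas.dataset_schema import MultiTurnSample
from ragas.messages import HumanMessage, AIMessage, ToolCall
from ragas.metrics import ToolCallAccuracy

sample = MultiTurnSample(
    user_input=[
        HumanMessage(content="What's the weather in Paris today?"),
        AIMessage(
            content="Let me look that up.",
            tool_calls=[ToolCall(name="get_weather", args={"city": "Paris"})],
        ),
    ],
    # The tool calls we expected the agent to make.
    reference_tool_calls=[ToolCall(name="get_weather", args={"city": "Paris"})],
)

scorer = ToolCallAccuracy()
score = asyncio.run(scorer.multi_turn_ascore(sample))
print(score)  # 1.0 when the agent's tool calls match the reference
```

Note that this metric compares the traced tool calls against a reference rather than asking an LLM judge, which is why it works even for very simple agents today.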
Join us to build, ship, and share an agentic application or two that can make a big impact with just a few lines of code! We'll talk about agency levels, code agents, and framework comparisons. See you there!
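How few lines? One library built around the "code agent" idea is Hugging Face's smolagents, where the agent plans by writing Python rather than emitting JSON tool calls. A minimal sketch, assuming an HF_TOKEN is configured for the hosted inference model:

```python
# Minimal sketch: a web-search code agent in a few lines with
# Hugging Face's smolagents. Assumes HF_TOKEN is set for the
# default hosted inference model.
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(
    tools=[DuckDuckGoSearchTool()],  # the agent writes Python that calls this tool
    model=HfApiModel(),              # defaults to a hosted instruct model
)

# The agent plans and acts in generated Python code, step by step.
print(agent.run("How many seconds does light take to travel from the Sun to Earth?"))
```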
We continue our discussion of Large Reasoning Models with a deep dive into continuous chains of thought! The official repo was just released, so join us to learn about the tech and give it a test drive!
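If you want a feel for the core trick before the session: instead of decoding a token at each reasoning step, the model's last hidden state is fed straight back in as the next input embedding, so "thoughts" stay continuous rather than being squeezed through the vocabulary. A toy sketch of that loop with an off-the-shelf GPT-2 (purely illustrative; the real method trains the model into this regime, and the official repo is the place to see that):

```python
# Toy sketch of the continuous chain-of-thought loop: instead of
# sampling a token, append the final hidden state as the next input
# embedding for a few "latent" steps, then decode normally.
# Off-the-shelf GPT-2 is untrained for this -- illustration only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")

inputs = tok("2 + 3 * 4 =", return_tensors="pt")
embeds = model.get_input_embeddings()(inputs.input_ids)

n_latent = 4  # number of continuous-thought steps (an assumption)
for _ in range(n_latent):
    out = model(inputs_embeds=embeds, output_hidden_states=True)
    # Final-layer hidden state at the last position becomes the next
    # input embedding (GPT-2's hidden and embedding dims match).
    last_hidden = out.hidden_states[-1][:, -1:, :]
    embeds = torch.cat([embeds, last_hidden], dim=1)

# After the latent steps, decode a token from the augmented sequence.
out = model(inputs_embeds=embeds)
next_token = out.logits[:, -1, :].argmax(-1)
print(tok.decode(next_token))
```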
💡 Transformation Spotlight: Xico Casillas! Follow his journey from conversational interface designer to leading his team's LLM and RAG app development. Read more!