We explored Meta's "Perfect Blend" paper, which "redefines RLHF" through the innovative Mixture-of-Judges (MoJ) approach. Participants learned how the new Constrained Generative Policy Optimization (CGPO) technique attempts to improve on classic Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO), helping to tackle reward hacking and optimize multi-task objectives. We broke down key concepts, from "OG" RLHF to RLAIF and DPO, while considering other "Mixture of" approaches like Mixture of Experts (MoE) and Mixture of Agents (MoA) to understand what's genuinely new and novel here. The code is still in the works, so it wasn't yet available to fully demo, but we showed the current status via PRs and discussed the host of brand-new concepts and acronyms!
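For a concrete anchor, here's a minimal sketch (our own, not code from the paper) of the standard DPO objective that CGPO is positioned against; it assumes you've already computed summed per-sequence log-probabilities under the policy and a frozen reference model:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss over summed per-sequence token log-probs."""
    # Implicit rewards are scaled log-prob ratios against the frozen reference
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # Push the chosen response's implicit reward above the rejected one's
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Toy batch of two preference pairs (log-probs would come from real models)
loss = dpo_loss(torch.tensor([-12.0, -9.5]), torch.tensor([-14.0, -9.0]),
                torch.tensor([-12.5, -9.8]), torch.tensor([-13.5, -9.2]))
print(loss.item())
```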
Pano Evangeliou, a Sr. AI Engineer and AI Tech Lead, went from working in computational mechanics to AI Engineering. Learn how he developed the portfolio and skills he needed, from The Netherlands, without a software engineering background! He has some advice if you're looking to do the same!
We'll be live in Austin, TX, next week! Join the AIM team in person at MLOps World 2024!
Go from zero to agentic hero and start building production-ready multi-agent systems in just 3 hours. 🧠💻 Workshop deets here. Ticket discounts for AIM community members here.
🧑‍💻 Join us live on YouTube every Wednesday at 10 AM PT for more concepts and code!
Inference & GPU Optimization: VPTQ
In Part 3 of our series on Inference & GPU Optimization, we cover VPTQ (Vector Post-Training Quantization), which pushes to even lower bit widths than the two methods we covered previously, GPTQ and AWQ. Let's quantize with all three methods and see what happens.
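To build intuition for the "V" before we run the real thing, here's a toy sketch (ours, not the actual VPTQ implementation) of vector quantization on a weight matrix: group weights into short vectors, cluster them into a small codebook, and store only centroid indices, so the cost per weight is roughly log2(codebook size) / vector length bits.

```python
import numpy as np
from sklearn.cluster import KMeans

def vector_quantize(W, vec_len=4, codebook_size=256, seed=0):
    """Toy vector quantization of a weight matrix.

    Reshapes W into (num_vectors, vec_len) vectors, learns a codebook with
    k-means, and returns the dequantized matrix. Storage cost is about
    log2(codebook_size) / vec_len bits per weight (here: 8 bits / 4 weights
    = 2 bits per weight, ignoring the codebook itself).
    """
    vectors = W.reshape(-1, vec_len)
    km = KMeans(n_clusters=codebook_size, n_init=1, random_state=seed).fit(vectors)
    codebook, indices = km.cluster_centers_, km.labels_
    return codebook[indices].reshape(W.shape)

# Quantize a random "layer" and measure reconstruction error
W = np.random.randn(256, 256).astype(np.float32)
W_hat = vector_quantize(W)
print("mean squared error:", np.mean((W - W_hat) ** 2))
```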
Did you see that LLMs like Claude can now use computers? We want to check out the capability for ourselves. We'll cover how this works under the hood and how, today, we're doing our best to evaluate these AI systems.
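Under the hood, computer use is an agent loop: the model sees a screenshot, proposes a GUI action, the harness executes it, and the new screen goes back to the model. Here's a minimal sketch of that loop; the helpers below are hypothetical stand-ins for the vendor's actual tool-use API, not real library calls.

```python
def take_screenshot() -> bytes:
    """Hypothetical: capture the current screen for the model to see."""
    return b""

def propose_action(task: str, screenshot: bytes, history: list) -> dict:
    """Hypothetical: ask the model for the next GUI action given the screen."""
    return {"type": "done"}

def execute_action(action: dict) -> None:
    """Hypothetical: perform the action (click, type, scroll, key press)."""

def computer_use_loop(task: str, max_steps: int = 20) -> list:
    """Observe-think-act loop: screenshot in, GUI action out, repeat."""
    history = []
    for _ in range(max_steps):
        screenshot = take_screenshot()
        action = propose_action(task, screenshot, history)
        if action["type"] == "done":  # model believes the task is complete
            break
        execute_action(action)
        history.append(action)  # past actions become context for the next step
    return history
```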
Weekly Concepts, Code, and Community! Every Saturday you'll receive a detailed overview of one of the latest tools or techniques from the open-source LLM Edge.
Hey, AIM community! Join Dr. Greg and The Wiz as they cover Large Reasoning Models next Wednesday, Jan 18. With the release of o1, o3, and the new Gemini, everyone is talking about Chain-of-Thought Reasoning and Test-Time Compute. What are these things, anyway? And what are the implications for building production LLM applications with models like this in 2025 and beyond? Join the discussion live on Wed. at 10 AM PT! Last week, we dove into the latest RAG Evaluation metrics and RAGAS...
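As a quick preview, one concrete flavor of test-time compute is best-of-N sampling: spend extra inference-time FLOPs by drawing several candidate chains of thought and keeping the one a verifier scores highest. In this sketch, `generate` and `score_answer` are hypothetical stand-ins for your model and verifier, not real library calls:

```python
import random

def generate(prompt: str, temperature: float = 0.8) -> str:
    """Hypothetical stand-in for an LLM call that samples a reasoning chain."""
    return f"candidate answer {random.random():.3f}"

def score_answer(candidate: str) -> float:
    """Hypothetical verifier / reward model scoring a candidate."""
    return random.random()

def best_of_n(prompt: str, n: int = 8) -> str:
    """Best-of-N sampling: trade extra test-time compute for accuracy."""
    candidates = [generate(prompt) for _ in range(n)]  # sample N reasoning chains
    return max(candidates, key=score_answer)           # keep the best-scoring one

print(best_of_n("What is 17 * 24?"))
```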
Hey, AIM community! As we near the end of 2024, our team is looking back at all we've accomplished as a community this year. Thanks to all of you for learning, building, shipping, and sharing with us at the open-source LLM Edge! We'll be rooting for you to take your AI career to the next level in 2025, and when you do, we hope you'll lean on us to amplify your story and showcase your best work. In this way, you'll help the AI Makerspace community achieve its mission of becoming the...
Hey, AIM community! Dr. Greg and The Wiz will go on-prem with LangGraph next week! Join us for our last YouTube Live event before the New Year! Last Wednesday, Dr. Greg and The Wiz guest spoke with Malikeh from Arcee on the SLM Show about the year in review at the LLM Edge and what to expect in 2025! We also explored vLLM! We learned that Virtual LLM helps us relieve memory bottlenecks when serving LLMs through PagedAttention, just like Virtual Memory relieves memory bottlenecks in...
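If you want to see PagedAttention at work, vLLM's offline API makes it easy to try; a minimal sketch (the model name is just an illustrative choice):

```python
from vllm import LLM, SamplingParams

# PagedAttention manages the KV cache in fixed-size blocks behind the scenes,
# much like an OS pages virtual memory, so throughput stays high under load.
llm = LLM(model="meta-llama/Llama-3.1-8B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Explain PagedAttention in one sentence."], params)
print(outputs[0].outputs[0].text)
```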