πŸ—πŸš’πŸš€ Next-Level RLHF



Hey, AIM community! Here's a quick recap of week 44 at AI Makerspace.

TL;DR



🏫 Weekly Concepts and Code!

πŸ‘©β€βš–οΈπŸ‘¨πŸ»β€βš–οΈ Mixture of Judges: Next-Level RLHF​

We explored Meta's "Perfect Blend" paper, which "redefines RLHF" through the innovative Mixture-of-Judges (MoJ) approach. Participants learned how the new Constrained Generative Policy Optimization (CGPO) technique attempts to improve on classic Proximal Policy Optimization (PPO) and Direct Preference Optimization (DPO), helping to tackle reward hacking and to optimize multi-task objectives. We broke down key concepts, from OG RLHF to RLAIF and DPO, and considered other "Mixture of" approaches, like Mixture of Experts (MoE) and Mixture of Agents (MoA), to understand what's new and novel here. The code is still in the works, so we couldn't fully demo it yet, but we showed its current status via PRs and discussed the host of brand-new concepts and acronyms!
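To ground the acronyms: DPO is the piece that's easiest to sketch in a few lines. It skips the reward model and PPO rollout entirely, directly pushing the policy to prefer the chosen response over the rejected one relative to a frozen reference model. Here's a minimal, illustrative NumPy sketch of the DPO loss on sequence log-probabilities (our own toy example, not Meta's CGPO code; the inputs are made-up numbers):

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Minimal Direct Preference Optimization (DPO) loss on log-probs.

    The policy is rewarded for preferring the chosen response over the
    rejected one *more than the frozen reference model does* -- no reward
    model or PPO rollout needed. beta scales how hard we push.
    """
    # Implicit reward margin: how much more the policy prefers chosen vs.
    # rejected, relative to the reference model.
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    # -log(sigmoid(beta * margin)): small when the margin is large.
    return -np.log(1.0 / (1.0 + np.exp(-beta * margin)))

# Toy numbers: the policy prefers the chosen response more than the
# reference does, so the margin is positive and the loss is small.
print(dpo_loss(-5.0, -9.0, -6.0, -8.0))
```

CGPO layers constraints and multiple "judges" on top of objectives like this to keep multi-task optimization from being gamed, which is exactly the reward-hacking problem discussed on the stream.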


🧰 Resources


🌐 Around the Community!


πŸ’‘ Transformation Spotlight

  • Pano Evangeliou, a Sr. AI Engineer and AI Tech Lead, went from working in computational mechanics to AI Engineering. Learn how he built the portfolio and skills he needed, from The Netherlands, without a software engineering background! He has some advice if you're looking to do the same!
video preview

🀩 The AI Engineering Bootcamp, Cohort 4 Demo Day!

video preview

🌍 Check out what the AIM community is building, shipping, and sharing!


πŸ₯³ Upcoming Events!

We'll be live in Austin, TX, next week! Join the AIM team in person at MLOps World 2024!

Go zero to agentic hero and start building production-ready multi-agent systems in just 3 hours. πŸ”§ πŸ’» Workshop deets here. Ticket discounts for AIM community members here.

πŸ§‘β€πŸ’» Join us live on YouTube every Wednesday at 10 AM PT for more concepts and code!

Inference & GPU Optimization: VPTQ

In part 3 of our series on Inference & GPU Optimization, we cover even lower-bit quantization than the previous two methods, GPTQ and AWQ. Let's quantize with all the methods and see what happens.
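If you want the core idea before the stream: VPTQ-style approaches quantize *vectors* of weights against a shared codebook, so you store a few index bits per vector instead of a scalar per weight. Here's a toy NumPy sketch of that idea (our own illustration with a made-up 4-entry codebook, not the actual VPTQ algorithm, which learns its codebooks and corrects error much more carefully):

```python
import numpy as np

# Toy vector quantization: group weights into length-2 vectors and snap
# each vector to its nearest codebook entry, so only codebook indices
# (2 bits per vector here -> 1 bit per weight) need to be stored.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8)).astype(np.float32)   # toy weight matrix
v = 2                                            # vector length
vectors = W.reshape(-1, v)                       # (32, 2) weight vectors

# Hypothetical shared codebook of 4 entries (real methods learn this).
codebook = np.array([[-1.0, -1.0], [-1.0, 1.0],
                     [ 1.0, -1.0], [ 1.0,  1.0]], dtype=np.float32)

# Assign each vector to its nearest codebook entry (squared distance).
dists = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
idx = dists.argmin(axis=1)                       # what actually gets stored
W_q = codebook[idx].reshape(W.shape)             # dequantized weights

print("mean abs error:", np.abs(W - W_q).mean())
```

Scalar methods like GPTQ and AWQ hit a wall below ~2 bits per weight; sharing structure across a whole vector is what lets codebook methods push lower, which is the comparison we'll run live.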

Teaching LLMs to Use Computers

Have you seen that LLMs like Claude can now use computers? We want to check out this capability for ourselves. We'll cover how it works under the hood and how, today, we're doing our best to evaluate these AI systems.


πŸ–ΌοΈ Meme of the Week


Applications are open for LLM Engineering: The Foundations Cohort 3 and The AI Engineering Bootcamp, Cohort 5 (read Richard's Cohort 4 review)!
Learn concepts and code free on YouTube and the Awesome AIM Index.


Keep building πŸ—οΈ, shipping 🚒, and sharing πŸš€, with a community!


Dr. Greg, The Wiz, Seraacha, and Lusk
AI Makerspace

