
The LLM Edge

Featured Post

πŸ—πŸš’πŸš€ RAG Evaluation

Hey, AIM community! Join Dr. Greg and The Wiz as they cover Large Reasoning Models next Wednesday, Jan 18. With the release of o1, o3, and the new Gemini, everyone is talking about Chain-of-Thought Reasoning and Test-Time Compute. What are these things, anyway? And what are the implications for building production LLM applications with models like this in 2025 and beyond? Join the discussion live on Wed. at 10 AM PT! Last week, we dove into the latest RAG Evaluation metrics and RAGAS...

👋 Hey, AIM community! As we near the end of 2024, our team is looking back at all we've accomplished as a community this year. Thanks to all of you for learning 📚, building 🏗️, shipping 🚢, and sharing 🚀 with us at the open-source LLM Edge! We'll be rooting for you to take your AI career to the next level in 2025, and when you do, we hope you'll lean on us to amplify your story and showcase your best work. In this way, you'll help the AI Makerspace community achieve its mission of becoming the...

👋 Hey, AIM community! Dr. Greg and the Wiz will go on-prem with LangGraph next week! Join us for our last YouTube Live event before the New Year 🎆! Last Wednesday, Dr. Greg and The Wiz guest spoke with Malikeh from Arcee on the SLM Show about the year in summary at the LLM Edge, and what to expect in 2025! We also explored vLLM! We learned that Virtual LLM helps us relieve memory bottlenecks when serving LLMs through PagedAttention, just like Virtual Memory relieves memory bottlenecks in...

👋 Hey, AIM community! Dr. Greg and the Wiz will unlock vLLM for you next week with a full breakdown of "Easy, fast, and cheap LLM serving for everyone." Last Wednesday, we explored AG2: AutoGen, Evolved with co-creator Qingyun Wu. The origin story was fascinating - from MathChat to going viral! AutoGen is all about conversations - which effectively constitute reasoning - by going full send on messages. 🧰 Resources 🧑‍🏫 Concepts: Slides 🧑‍💻 Code: CaptainAgent Notebook 📜 Paper: AutoGen The AutoGen...

👋 Hey, AIM community! Join Dr. Greg, The Wiz, and the creators of AutoGen next Wednesday for AG2: AutoGen, Evolved! They just dropped new features and a new website. Join us to hear the latest! Last Wednesday, AI Makerspace explored On-Prem Agentic RAG: Report Generation with LlamaIndex! We dug into what "on-prem" means, exactly, how dependency hell is extra real on-prem, and how there are unique challenges you run into when operating at the LLM edge. 🧰 Resources 🧑‍🏫 Concepts: Slides 🧑‍💻 Code:...

👋 Hey, AIM community! Next Wednesday, Dr. Greg & The Wiz 🪄 will explore the concepts and code behind On-Prem Agentic RAG! Last Wednesday, they explored FA2: Next-Level Attention. They dug all the way down into the "shadow of the warp groups" on GPU hardware. It was epic. S/o to @Allan Tan with the awesome community recap. 🧰 Resources 🧑‍🏫 Concepts: Slides 🧑‍💻 Code: Flash Attention - AIM Event 📜 Papers: FA, FA2, FA3 🔭 Coming Up! AG2: AutoGen, Evolved December 4, 2024 The co-creators of AutoGen...

👋 Hey, AIM community! Next week, Dr. Greg & The Wiz will explore the concepts and code behind calculating attention in practice with FA2: Next-Level Attention. FA2 = Flash Attention 2. Last week, they tested Claude's Computer Use. It turns out that LLMs seriously can drive your computer now! This is super exciting, but also potentially really scary. As a friendly reminder, never give an LLM access to YOUR computer, only access to A (preferably virtual) computer. 🧰 Resources 🧑‍🏫 Concepts:...

Hey, AIM community! Here's a quick recap of week 44 at AI Makerspace. TL;DR ⚛️ Learn about 👩‍⚖️👨‍⚖️ Mixture of Judges: Next-Level RLHF 📚 Learning, building, shipping, and sharing with the AIM Community! 💡 Transformation spotlight: Pano Evangeliou 🏫 1-minute lesson: What is the “Golden Chunk” in RAG? 🤓 See what folks are building, shipping, and sharing this week 📚 LLM Engineering detailed schedule. Take the LLME challenge! ⏭️ Join us live next week! RSVP here: Inference & GPU Optimization: VPTQ...

TL;DR Welcome, LLM practitioner! Here's a quick recap of week 40 at AI Makerspace! βš›οΈ Learn about 🐝 Swarm: Multi-Agent Orchestration πŸ“š Learning, building, shipping, and sharing with the AIM Community! πŸ’‘ Transformation spotlight: Nitin Gupta 🏫 1-minute lesson: Basic RAG Retrieval πŸ€“ See what folks are building, shipping, and sharing this week ⛰️ AI Summits of San Francisco, Week 44, by Mike Chrabaszcz πŸ“š Cohort 3 of LLM Engineering: The Foundations kicks off Nov. 14 ⏭️ Join us live next week!...

TL;DR Welcome, LLM practitioner! Here's a quick recap of week 40 at AI Makerspace! βš›οΈ Learn about Contextual-Retrieval. Hint: optimizing RAG is about more than chunk sizes. πŸ“š Learning, building, shipping, and sharing with the AIM Community! πŸ’‘ Transformation spotlight: Mert Bozkir 🏫 1-minute lesson: How GPTQ minimizes the quantization error πŸ¦™ AIM was building, shipping, and sharing at the RAG-a-thon! πŸ§‘πŸ« Check out our playlist to Learn RAG & Agents! πŸ“š Cohort 3 of LLM Engineering: The...