About this episode
Podcast: Complex Systems with Patrick McKenzie (patio11)
Episode: Inference engineering and the real-world deployment of LLMs, with Philip Kiely
Pub date: 2026-03-12

Patrick McKenzie (patio11) and Philip Kiely, early employee at Baseten, discuss the inference stack: the critical layer of software and hardware that sits between a model’s weights and a user’s prompt. They cover inference engineering, how intermediate layers are evolving over a technical stack that is changing every six months, and how sophisticated organizations are actually consuming LLMs beyond just writing their questions into chatbot apps.

Full transcript available here: www.complexsystemspodcast.com/inference-engineering-with-philip-kiely/

Presenting Sponsors: Mercury, Meter, & Granola

Complex Systems is presented by Mercury: radically better banking for founders. Mercury offers the best wire experience anywhere: fast, reliable, and free for domestic U.S. wires, so you can stay focused on growing your business. Apply online in minutes at mercury.com.

Networking infrastructure has a way of accumulating technical debt faster than almost anything else in IT. Meter handles the full stack (wired, wireless, and cellular) as a single integrated solution: designed, deployed, and managed end-to-end so there's only one vendor to call when something goes wrong. Visit meter.com/complexsystems to book a demo.

If meetings consistently leave you with hazy action items and lost context, Granola handles the transcription so you can actually participate and gives you searchable notes afterward. Try it free at