Hardening Agents for E-commerce Scale: From RL Alignment to Reliability // Panel 2

About this episode

Thanks to Prosus Group for collaborating on the Agents in Production Virtual Conference 2025.

Abstract //
The discussion centers on highly technical yet practical themes, such as the use of advanced post-training techniques like Direct Preference Optimization (DPO) and Parameter-Efficient Fine-Tuning (PEFT) to ensure LLMs maintain stability while specializing for e-commerce domains. We compare the implementation challenges of Computer-Using Agents in automating legacy enterprise systems versus the stability issues faced by conversational agents when inputs become unpredictable in production. We also analyze the role of cloud infrastructure in supporting the continuous, iterative training loops required by Reinforcement Learning-based agents for e-commerce.

Bio //

Paul van der Boor (Panel Host) //
Paul van der Boor is a Senior Director of Data Science at Prosus and a member of its internal AI group.

Arushi Jain (Panelist) //
Arushi is a Senior Applied Scientist at Microsoft, working on LLM post-training for Computer-Using Agents (CUA) through Reinforcement Learning. She previously completed Microsoft’s competitive 2-year AI Rotational Program (MAIDAP), building and shipping AI-powered features across four product teams.

She holds a Master’s in Machine Learning from the University of Michigan, Ann Arbor, and a Dual Degree in Economics from IIT Kanpur. At Michigan, she led the NLG efforts for the Alexa Prize Team, securing a $250K research grant to develop a personalized, active-listening socialbot. Her research spans collaborations with Rutgers School of Information, Virginia Tech’s Economics Department, and UCLA’s Center for Digital Behavior.

Beyond her technical work, Arushi is a passionate advocate for gender equity in AI. She leads the Women in Data Science (WiDS) Cambridge community, scaling participation in her ML workshops from 25 women in 2020 to 100+ in 2025 and empowering women and non-binary technologists through education and mentorship.

Swati Bhatia //
Passionate about building and investing in cutting-edge technology to drive positive impact. Currently shaping the future of AI/ML at Google Cloud. 10+ years of global experience across the U.S., EMEA, and India in product, strategy & venture capital (Google, Uber, BCG, Morpheus Ventures).

Audi Liu //
I’m passionate about making AI more useful and safe. Why? Because AI will be ubiquitous in every workflow, powering our lives just as electricity revolutionized our society. It’s pivotal we get it right.

At Inworld AI, we believe all future software will be powered by voice. As a Sr Product Manager at Inworld, I'm focused on building a real-time voice API that empowers developers to