About this episode
The two models that you will hear discussed for at least the next two months - Claude Opus 4.6 and GPT 5.3 Codex - just got released within 26 minutes of each other. This is the full breakdown of around 250 pages of reports, with just the most interesting moments, from the battle of which is best, Claude personhood, the surprising misbehaviour of Opus 4.6, and much more.

https://assemblyai.com/aiexplained

Check out my fast-growing (!) app, free to use, and code INSIDER15 for Pro: https://lmcouncil.ai
AI Insiders ($9): https://www.patreon.com/AIExplained

Chapters:
00:00 - Introduction
00:54 - Self-improvement?
02:44 - Knowledge Work
05:30 - Overly agentic behaviour
09:12 - Who Shouldn’t Use Claude Opus
11:39 - Step-change?
15:09 - Claude’s ‘Personhood’

Hassabis Roadmap: https://www.patreon.com/posts/hassabis-roadmap-149750869
Release of Opus 4.6: https://www.anthropic.com/news/claude-opus-4-6
212 Page System Card: https://www-cdn.anthropic.com/0dd865075ad3132672ee0ab40b05a53f14cf5288.pdf
Claude Code Tip: https://x.com/bcherny/status/2019475897691124107
GPT Codex 5.3: https://openai.com/index/introducing-gpt-5-3-codex/
System Card: https://openai.com/index/gpt-5-3-codex-system-card/
Browse Comp: https://arxiv.org/pdf/2504.12516v1
Finance Agent: https://www.vals.ai/benchmarks/finance_agent
Terminal Bench 2: https://arxiv.org/pdf/2601.11868
Vending Bench: https://andonlabs.com/blog/opus-4-6-vending-bench
My X post: https://x.com/AIExplainedYT/status/2016851303436095647
Anthropic Apology: https://x.com/ch402/status/2014066134194995256/photo/1
Altman rebuttal: https://x.com/sama/status/2019139174339928189
https://x.com/sama/status/2019140276246442089
4% of GitHub: https://x.com/dylan522p/status/2019490550911766763

Non-hype Newsletter: https://signaltonoise.beehiiv.com/
Podcast: https://aiexplainedopodcast.buzzsprout.com/