Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that

18:26 Nov 14, 2025
About this episode
A lot just got released in the last 36 hours, and it will all affect hundreds of millions of people. 10 details you would miss if you just read the headlines, from GPT 5.1 regressions, to how Claude hacked Govt Agencies, to SIMA 2, and Musical Turing Tests.

https://assemblyai.com/aiexplained

Chapters:
00:00 - Introduction
00:56 - GPT 5.1 Smarter?
01:47 - Some Regressions
03:22 - Sycophancy?
05:22 - Claude Auto-Hacking
06:16 - Jailbreaking through Granularity
08:22 - This Will be Re-used
09:30 - Hallucinating Hacker
09:57 - Surprisingly Neutral Tone
12:18 - SIMA 2
14:10 - Alpha Parallels
17:24 - AI Music

GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/
System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf
Benchmarks: https://openai.com/index/gpt-5-1-for-developers/
Simple Bench: https://lmcouncil.ai/benchmarks
Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618
https://www.anthropic.com/news/disrupting-AI-espionage
Report: https://assets.anthropic.com/m/ec212e6566a0d47/original/Disrupting-the-first-reported-AI-orchestrated-cyber-espionage-campaign.pdf
Sima 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
https://x.com/amoufarek/status/1988986075331858693
Scepticism: https://www.technologyreview.com/2025/11/13/1127921/google-deepmind-is-using-gemini-to-train-agents-inside-goat-simulator-3/
Voyager: https://voyager.minedojo.org/
Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/