AI Daily Brief — 29 September 2025
Monday compressed four landmark stories into one news cycle. Anthropic shipped Claude Sonnet 4.5 with state-of-the-art SWE-bench Verified (77.2%) and OSWorld (61.4% vs 42.2% for Sonnet 4), 30+ hour autonomous focus, native VS Code extension, and an open-sourced Claude Agent SDK — alongside Claude Code 2.0 with auto-saved checkpoints (Esc-Esc rewind), parallel sub-agents, and a beta memory tool with context-editing (39% gain on agentic search, 84% token reduction in 100-turn web search). OpenAI launched Instant Checkout in ChatGPT with Etsy live and Shopify imminent (Glossier, SKIMS, Vuori, Spanx queued) — built on the new Agentic Commerce Protocol co-developed with Stripe; Etsy stock popped 16%. DeepSeek released V3.2-Exp with Sparse Attention, cutting API prices 50%+ (output now $0.42/M tokens vs $1.68 under V3.1-Terminus). Governor Newsom signed SB 53, California’s Transparency in Frontier AI Act — the first US state-level frontier-AI safety law.
Top stories
- Anthropic ships Claude Sonnet 4.5. “World’s best coding model” — SWE-bench Verified 77.2%, OSWorld 61.4% (vs 42.2% for Sonnet 4), 30+ hour multi-step focus, unchanged $3/$15 per Mtoken. via Anthropic
- Claude Code 2.0 with checkpoints and VS Code extension. Auto-saved checkpoints (Esc-Esc or /rewind, 30-day retention), refreshed terminal UI, parallel sub-agents, hooks, native VS Code extension in beta with sidebar inline diffs. via Anthropic
- Anthropic memory beta + context-editing API. Beta memory tool persisting state across sessions via client-side /memories directory; context-editing auto-clears stale tool results. Combined: 39% gain on agentic search, 84% token reduction in a 100-turn web search eval. via Anthropic
- Claude Agent SDK open-sourced. Same infrastructure powering Claude Code, now Python (claude-agent-sdk on PyPI) and TypeScript (@anthropic-ai/claude-agent-sdk on npm) — file editing, bash, web search/fetch, sub-agents, persistent sessions, first-class MCP client support. via Anthropic
- OpenAI launches Instant Checkout in ChatGPT. US ChatGPT users buy products inside chat; Etsy sellers live, Shopify (Glossier, SKIMS, Spanx, Vuori) imminent. Built on the new Agentic Commerce Protocol co-developed and open-sourced with Stripe. Etsy stock popped 16%. via CNBC
- DeepSeek releases V3.2-Exp with Sparse Attention. MIT-licensed, builds on V3.1-Terminus, debuts DSA — fine-grained sparse attention via lightning indexer + fine-grained token selection. Faster long-context training/inference at parity quality. Live on app, web and API. via DeepSeek
- DeepSeek slashes API prices 50%+. Input cache hits to $0.028/M, cache misses to $0.28/M, outputs to $0.42/M (down from $0.07 / $0.56 / $1.68 under V3.1-Terminus). via VentureBeat
- Newsom signs SB 53 — Transparency in Frontier AI Act. California first US state to directly regulate frontier-model developers (>$500M revenue, models trained above 10^26 FLOPs). Required public safety frameworks, transparency reports, incident reporting to CalOES, whistleblower protections, CalCompute public cluster consortium. AG can pursue civil penalties up to $1M per violation. Effective Jan 1, 2026. via Office of Governor Newsom
- SB 53: Anthropic endorses, OpenAI and Meta lobbied against. Anthropic co-founder Jack Clark called the bill “a solid blueprint for AI governance.” OpenAI’s Chris Lehane said “America leads best with clear, nationwide rules, not a patchwork of state or local regulations.” Meta also lobbied against. via Anthropic
Who shipped
Anthropic shipped Sonnet 4.5, Claude Code 2.0, memory beta, context-editing API, Agent SDK and the “Imagine with Claude” research preview (5-day Max-tier preview generating software on the fly with no pre-written code). OpenAI shipped Instant Checkout with Stripe-built Agentic Commerce Protocol. DeepSeek shipped V3.2-Exp with TileLang + CUDA/FlashMLA open kernels (vLLM Day-0 support). Google DeepMind published NeurIPS-accepted ReStraV detector for AI-generated videos.
Open-source pulse
DeepSeek dominated open-source Monday. V3.2-Exp weights on Hugging Face MIT, plus open kernels in TileLang and CUDA/FlashMLA, plus immediate vLLM Day-0 support. The 50%+ API price cut reframes the open-weights long-context economics overnight.
Money, infra & hardware
EPA Administrator Lee Zeldin announced TSCA chemical-review prioritization for AI data-center projects (>$500M capex or >100 MW incremental load) starting with submissions received on or after Sep 29 — authorized by Trump EO 14318. No frontier funding rounds Monday.
Quiet corners
Anthropic published the Sonnet 4.5 system card detailing comprehensive alignment evaluations, ASL-3 protections, and reductions in sycophancy, deception and power-seeking — described as Anthropic’s most aligned frontier model to date. Google DeepMind’s ReStraV NeurIPS poster on perceptual-straightening video detection landed alongside.
By the numbers
- 77.2% — Sonnet 4.5 SWE-bench Verified
- 61.4% — Sonnet 4.5 OSWorld (vs 42.2% for Sonnet 4 four months prior)
- 30+ hours — Sonnet 4.5 sustained multi-step focus
- 50%+ — DeepSeek API price cut from V3.2-Exp Sparse Attention
- 16% — Etsy stock pop on Instant Checkout news
- 10^26 FLOPs / $1M — SB 53 threshold / max per-violation penalty
- Most-mentioned company: Anthropic
- Quietest segment: frontier funding rounds
Compiled by AI Feed’s editor from verified web sources for 29 September 2025.