AI Daily Brief — 22 May 2025

The biggest Anthropic day in the lab’s history. Opus 4 and Sonnet 4 set new coding SOTAs and ran sustained 7-hour agentic tasks. Claude Code went GA. The Responsible Scaling Policy hit ASL-3 for the first time. And Stargate UAE formally launched as the first overseas Stargate deployment.

Top stories

  • Anthropic launches Claude Opus 4 and Claude Sonnet 4. Opus 4 positioned as ‘world’s best coding model.’ Both are hybrid reasoning models with extended thinking. Pricing unchanged: Opus 4 at $15/$75 per M tokens, Sonnet 4 at $3/$15. Available on Anthropic API, Amazon Bedrock and Google Cloud Vertex AI. via Anthropic
  • Opus 4 sets SOTA on coding benchmarks. Opus 4 scored 72.5% on SWE-Bench Verified and 43.2% on Terminal-bench; Sonnet 4 scored 72.7% on SWE-Bench Verified. Anthropic showed Opus 4 working autonomously for ~7 hours on a Rakuten refactoring task, a demonstration of sustained multi-hour agentic coding.
  • Claude Code goes generally available. After the February research preview, Claude Code reached GA with a new Claude Code SDK, native VS Code and JetBrains integrations, and background tasks via GitHub Actions. The product later reached $1B ARR in ~6 months.
  • Extended thinking with tool use + interleaved thinking. Both Claude 4 models can call tools (e.g. web search) during extended thinking. New ‘interleaved thinking’ lets the model reason between tool calls rather than only before the first one; parallel tool use and improved memory (via local files) are also supported.
  • Anthropic ships four new API capabilities for agents. Code execution tool (Python sandbox), MCP connector (connect to any remote MCP server without writing client code), Files API, and prompt caching extended to 1 hour. All in public beta. via Anthropic
  • First-ever ASL-3 Deployment and Security Standards activation. Triggered by Opus 4’s improved performance on CBRN-related (especially biology) tasks. ASL-3 adds weight-theft hardening and narrowly targeted misuse safeguards. Claude Opus 4 system card published same day. via Anthropic
  • Stargate UAE officially announced in Abu Dhabi. G42, OpenAI, Oracle, NVIDIA, SoftBank and Cisco unveiled Stargate UAE — a 1GW compute cluster inside a 5GW UAE-US AI Campus. First international Stargate deployment and first project under ‘OpenAI for Countries.’ Built by G42, operated by OpenAI and Oracle. NVIDIA supplies GB300 systems; first 200MW slice live in 2026. via OpenAI
  • G42 chip-allocation framing — up to 500,000 advanced AI chips/year. Reporting put the underlying US-UAE accord at up to 500K of NVIDIA’s most advanced AI chips per year from 2025-2027 (~$15B potential), with ~one-fifth flowing directly to G42 and the rest to US hyperscalers in the Emirates. Site planned to ultimately house up to 2.5M B200-class GPUs. via CNBC
  • Anthropic’s first ‘Code with Claude’ developer conference. Inaugural one-day hands-on event in San Francisco featuring the Claude 4 launch, sessions on the Anthropic API, CLI tools and MCP, and office hours with technical teams.
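
To put the per-token pricing above in concrete terms, here is a minimal sketch that turns the quoted list prices into a dollar estimate for a workload. The model labels (`opus-4`, `sonnet-4`) are illustrative keys, not API model identifiers, and the token counts are a hypothetical workload. The estimate ignores any prompt-caching or batch discounts.

```python
# List prices quoted in the brief, in USD per million tokens: (input, output).
# The dictionary keys are illustrative labels, not real API model IDs.
PRICES_PER_MTOK = {
    "opus-4": (15.00, 75.00),
    "sonnet-4": (3.00, 15.00),
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate list-price cost in USD, ignoring caching and batch discounts."""
    in_price, out_price = PRICES_PER_MTOK[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES_PER_MTOK:
        cost = estimate_cost_usd(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:,.2f}")
```

For this hypothetical workload the two list prices differ by a factor of five on both input and output, so the Opus 4 estimate is five times the Sonnet 4 estimate.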

Who shipped

Anthropic shipped the headline model, the developer surface (Claude Code GA), the API surface (4 new capabilities), the safety threshold (ASL-3) and the conference. OpenAI + G42 + Oracle + NVIDIA + SoftBank + Cisco shipped Stargate UAE.

Open-source pulse

The MCP connector for Anthropic’s API and Microsoft’s native MCP integration into Windows 11, announced at Build the same week, both push agent-tool plumbing toward MCP as a common standard. Meta pushed updates to Llama 4 weights on Hugging Face following the Llama for Startups program launched May 21.

Money, infra & hardware

Stargate UAE is the largest concrete commitment to international Stargate to date. Wall Street previews flagged NVIDIA’s Q1 FY26 print, scheduled for May 28; it would later show $44.1B revenue (+69% YoY) with a $4.5B H20 inventory charge.
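
The Stargate UAE figures reported in this brief imply a few ratios that are easy to check. The sketch below derives them using only numbers stated above. Two points are assumptions rather than reported facts: that the 2.5M-GPU ceiling applies to the full 5GW campus, and that dividing campus power by GPU count gives a meaningful all-in power budget per GPU (cooling and networking included).

```python
# Figures as reported in the brief.
CAMPUS_MW = 5_000            # 5GW UAE-US AI Campus
CLUSTER_MW = 1_000           # 1GW Stargate UAE cluster
FIRST_SLICE_MW = 200         # first slice, targeted for 2026
CHIPS_PER_YEAR = 500_000     # upper bound of the reported annual allocation
G42_FRACTION_DENOM = 5       # roughly one-fifth flows directly to G42
SITE_GPUS = 2_500_000        # planned ultimate GPU count at the site


def derived_ratios() -> dict:
    """Derive simple ratios from the reported figures."""
    return {
        "first_slice_share_of_cluster": FIRST_SLICE_MW / CLUSTER_MW,
        "cluster_share_of_campus": CLUSTER_MW / CAMPUS_MW,
        "g42_chips_per_year": CHIPS_PER_YEAR // G42_FRACTION_DENOM,
        # Assumption: the 2.5M-GPU figure refers to the full 5GW campus.
        "all_in_kw_per_gpu": CAMPUS_MW * 1_000 / SITE_GPUS,
    }


if __name__ == "__main__":
    for name, value in derived_ratios().items():
        print(f"{name}: {value}")
```

On these figures, the first slice is one-fifth of the cluster, the cluster is one-fifth of the campus, and G42's direct share comes to roughly 100,000 chips per year at the upper bound.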

Quiet corners

Google I/O 2025 content opened broadly on demand. Microsoft Build 2025 ran concurrently (May 19-22), with GitHub Copilot upgraded to an autonomous coding agent and MCP integrated natively into Windows 11.

By the numbers

  • 72.5% / 72.7% / 43.2% — Opus 4 SWE-Bench Verified / Sonnet 4 SWE-Bench Verified / Opus 4 Terminal-bench
  • $15 / $75 / $3 / $15 per M tokens — Opus 4 in/out, Sonnet 4 in/out
  • ~7 hours — Opus 4 autonomous Rakuten refactor
  • 4 — new Anthropic API capabilities (code exec, MCP connector, Files API, 1h prompt cache)
  • 1 GW / 200 MW / 2026 — Stargate UAE cluster / first slice / online target
  • 500,000 / ~$15B / 2.5M — annual NVIDIA chip allocation / value / planned ultimate B200-class GPUs at site
  • Most-mentioned company: Anthropic

Compiled by AI Feed’s editor from verified web sources for 22 May 2025.