AI Daily Brief — 23 May 2025
Claude Opus 4 'blackmail' system-card story breaks: Anthropic's own system-card testing found Opus 4 attempted blackmail in 84% of test rollouts when told it would be replaced.…
Anthropic launches Claude Opus 4 and Sonnet 4. Opus 4 sets SOTA on SWE-Bench Verified at 72.5%, ran 7 hours autonomously on a Rakuten refactor. Sonnet…
OpenAI acquires Jony Ive's io in a $6.5B all-equity deal — OpenAI's largest ever. 9-minute Altman-Ive teaser film hints at a screenless 'family of devices.' Mistral…
Google I/O 2025: Gemini 2.5 Pro tops LMArena; Gemini app 400M MAU; Deep Think mode; Veo 3 ships with synchronized audio; Imagen 4 at 2K; AI…
Merge pull request #259 from TianQi-777/patch-3 Update README.md
Microsoft Build 2025 opens with 50+ announcements: Copilot Tuning, autonomous GitHub Copilot Coding Agent, Microsoft Discovery (datacenter coolant candidate identified in about 200 hours), Windows AI Foundry, NLWeb, Entra Agent…
Eve of the densest enterprise-AI week of Q2. Jensen Huang takes the Computex stage Sunday evening PT (Monday morning Taiwan) — NVLink Fusion, DGX Spark and…
What makes a good leader? What do good leaders do? And commando, soldier, and police leadership.
Quiet Saturday. xAI's Grok system-prompt incident from earlier in the week keeps fueling the news cycle — France 24 and CNBC both run analysis pieces. CNBC…
Giving your models more time to think before prediction, like via smart decoding, chain-of-thought reasoning, latent thoughts, etc., turns out to be quite effective for unblocking…
Trump concludes the Gulf trip — $200B in new US-UAE deals plus an accelerated $1.4T UAE investment in US AI/tech. Total trip deal value approaches $2T.…
How to Build Your Own AI Research Agent with One Groq API Call
Trump and MBZ unveil a 5GW UAE-US AI Campus in Abu Dhabi — largest AI infrastructure project outside the US. G42 builds, OpenAI + Oracle operate,…
OpenAI rolls GPT-4.1 and GPT-4.1 mini into ChatGPT — GPT-4.1 mini replaces GPT-4o mini as the Free-tier default. Trump lands in Doha with the same CEO…
Megaday for the Gulf AI pivot. Trump lands in Riyadh, signs $600B US-Saudi pact. BIS formally rescinds the AI Diffusion Rule. NVIDIA-Humain: 18,000 GB300 Blackwells for…
Commerce signals Biden AI Diffusion Rule rescission. US-China 90-day tariff truce — US tariffs on Chinese imports drop 145% → 30%. OpenAI ships HealthBench (262 physicians,…
When a new dataset comes out, I get excited and check it out, and only then realize that it is another meta-mixed dataset combining a collection…
Merge pull request #51 from qscqesze/fix_guide Correct the spelling error in the ReadMe file: change guild to guide.
Correct the spelling error in the ReadMe file: change guild to guide. Signed-off-by: qingjun
Pope Leo XIV delivers his first Sunday Regina Caeli at St. Peter's Square — ~100,000 attendees. Three days after election; the AI-as-papacy-priority thread from yesterday continues…
Pope Leo XIV's first formal address to the College of Cardinals explicitly frames AI as 'another industrial revolution' posing 'new challenges for the defense of human…
Why build LLMs from scratch? It's probably the best and most efficient way to learn how LLMs really work. Plus, many readers have told me they…
Reuters reports NVIDIA will ship a downgraded H20 to Chinese cloud customers (Alibaba, ByteDance, Tencent) as early as July to comply with US export thresholds. Trump…
Cardinal Robert Prevost elected Pope Leo XIV — first US-born pope, name signaling AI as a defining issue for his pontificate. Senate AI hearing: Altman, Lisa…