AI Daily Brief — 26 April 2025
Quiet Saturday. The GPT-4o sycophancy update from Friday is live in ChatGPT and the viral screenshots start spreading. Alibaba's Qwen3 family is two days away. No…
Manus AI (Butterfly Effect) raises $75M Series B led by Benchmark at $500M valuation — 5x its prior price. US VC backing a Chinese AI agent…
Perplexity lands on Motorola Razr — its first smartphone OEM. Adobe MAX London ships Firefly Image Model 4 + 4 Ultra and 100+ Creative Cloud AI…
The Urgency of Interpretability: Why it's crucial that we understand how AI models work https://www.darioamodei.com/post/the-urgency-of-interpretability
OpenAI ships gpt-image-1 in the API — natively multimodal image gen for developers worldwide. 130M users generated 700M+ images in the first week of the consumer…
EU AI Office publishes preliminary GPAI guidelines — 10^22 FLOP training-compute threshold for presumed GPAI status, August 2 enforcement deadline. Google VP confirms in antitrust trial…
Pope Francis dies on Easter Monday — surfacing his AI ethics legacy (Rome Call, G7 address, Antiqua et Nova). Demis Hassabis on 60 Minutes: AGI in…
Western Easter Sunday — quiet day. Pope Francis makes an unannounced appearance for the Urbi et Orbi blessing — his final public appearance before his death…
Applying the scientific method, building via eval-driven development, and monitoring AI output.
Western Easter Saturday — genuinely quiet day. No frontier-lab releases, no funding rounds, no papers. The NVIDIA H20 export-control fallout continues to dominate weekend commentary. The…
Understanding GRPO and New Insights from Reasoning Model Papers
Good Friday, but not quiet. Jensen Huang continues his Beijing meetings. Perplexity inks a Motorola Razr distribution deal, its first major smartphone OEM win. OpenAI…
Claude Code is a command line tool for agentic coding. This post covers tips and tricks that have proven effective for using Claude Code across various…
Google launches Gemini 2.5 Flash in preview — first fully hybrid reasoning model with togglable thinking and budget control. Jensen Huang lands in Beijing one day…
OpenAI's biggest single day of 2025 so far: o3 and o4-mini ship with agentic tool use across web search, Python, image gen and image manipulation inside…
NVIDIA discloses $5.5B Q1 FY26 charge in 8-K after-hours — H20 China export license is now indefinite. Stock plunges after-hours. AMD files 8-K for up to…
A new paper that we will expand into our next book
Now in Preview: Groq’s First Compound AI System
OpenAI launches GPT-4.1 family — three API-only models with 1M context. GPT-4.1 hits 54.6% SWE-Bench Verified (vs GPT-4o 33.2%) and 90.2% MMLU. Mini ~83% cheaper than…
Sam Altman teases 'a lot of good stuff' coming this week — kicking off Monday. UNCTAD's 2025 Tech & Innovation Report continues to circulate, projecting AI…
Quiet Saturday between Llama 4 and GPT-4.1. ChatGPT memory rollout continues for Pro and Plus subscribers globally. Mira Murati's Thinking Machines Lab $2B seed reporting dominates…
Sam Altman live at TED2025: ChatGPT crosses 800M weekly active users. Vanilla Llama 4 Maverick lands #32 on LMArena — far below GPT-4o, Claude 3.5 Sonnet,…
Recent advances in Large Language Models (LLMs) enable exciting LLM-integrated applications. However, as LLMs have improved, so have the attacks against them. Prompt injection attack is…
Markets digest the historic +9.5% S&P / +12.2% Nasdaq rally. ChatGPT memory now references all past conversations — Pro first, Plus next. Sam Altman teases o3…