AI Daily Brief — 28 April 2025
One of the biggest open-source releases of 2025. Alibaba dropped eight Qwen3 models under Apache 2.0 with first-class hybrid reasoning — one chat template, one toggle for ‘thinking’ vs ‘non-thinking.’ Trained on 36T tokens across 119 languages. OpenAI started reverting the GPT-4o sycophancy update. LlamaCon was 24 hours away.
Top stories
- Alibaba launches Qwen3 — eight open-source hybrid reasoning models. Six dense (0.6B, 1.7B, 4B, 8B, 14B, 32B) and two MoE (30B-A3B with 3.3B active; flagship 235B-A22B with 22B active). All Apache 2.0 on Hugging Face, GitHub, ModelScope and chat.qwen.ai. First family-scale lab to ship togglable ‘thinking / non-thinking’ modes. Trained on ~36T tokens covering 119 languages — double the Qwen2.5 corpus. via TechCrunch
- Qwen3-235B-A22B benchmarks claim parity with DeepSeek-R1, o1, o3-mini, Grok-3, Gemini 2.5 Pro. 235B-parameter MoE activating 22B per token (94 transformer layers, 128 experts, 8 routed). 131,072-token context window, GQA + SwiGLU + RoPE + RMSNorm. Coding, math, general reasoning at a fraction of activation cost. via VentureBeat
- Qwen3 dense base models claim 1 generation of parameter efficiency over Qwen2.5. Alibaba claims dense bases match Qwen2.5 at roughly 2x parameter counts (Qwen3-1.7B/4B/8B/14B/32B ≈ Qwen2.5-3B/7B/14B/32B/72B). Context length: 32K for 0.6B-4B, 128K for 8B-32B. Recommended deployment: SGLang/vLLM for serving, Ollama/LMStudio/MLX/llama.cpp for local. via Qwen
- Qwen3 ships with agentic / tool-use focus. Improved tool calling, function-calling formatting and instruction following listed alongside reasoning as headline upgrades. The thinking-mode switch exposed via chat templates so apps can route hard tasks through deliberate reasoning and cheap tasks through fast direct answers. via Alibaba Cloud
- OpenAI rolls back the GPT-4o sycophancy update. Started Apr 28 after users complained the model had become excessively flattering. Free users reverted first; paid rollback followed. Post-mortem (published next day): new user-feedback reward signals had ‘overpowered existing safeguards.’ via OpenAI
- SCMP: Qwen3 framed as China’s open-source answer to US frontier labs. Boldest Chinese open-source release since DeepSeek R1. Intensifies the Alibaba–DeepSeek rivalry inside China. via SCMP
Who shipped
Alibaba ran the day with the broadest open-weight Chinese LLM release of the year. OpenAI ran damage control. Meta was setting up LlamaCon for the next morning.
Open-source pulse
Qwen3 is unambiguously the open-weights drop of Q2. Eight model sizes plus Apache 2.0 plus hybrid reasoning plus 119-language coverage puts Alibaba at the leading edge of open frontier-tier models.
Money, infra & hardware
No major funding rounds dated Apr 28.
Quiet corners
No major Anthropic or Google product announcement. Both labs in pre-Claude-4 / pre-Google-I/O quiet periods. No Mistral or DeepSeek release for the date — DeepSeek Prover-V2 was two days out.
By the numbers
- 8 — Qwen3 model variants released
- 235B / 22B / 128 experts / 8 routed — flagship spec
- ~36T tokens / 119 languages — training corpus
- 131,072 — flagship context window
- 0.6B → 32B — dense size range
- Most-mentioned company: Alibaba / Qwen
Compiled by AI Feed’s editor from verified web sources for 28 April 2025.