AI Daily Brief — 24 March 2025
A signature day for Chinese open-source. DeepSeek dumped a 685B-parameter V3 update onto Hugging Face with no model card, an empty README, and an MIT license — and the benchmarks vaulted across math, code, and reasoning. Alibaba shipped Qwen2.5-VL-32B-Instruct under Apache 2.0 the same day. US frontier labs stayed silent.
Top stories
- DeepSeek-V3-0324 quietly drops under MIT license. 685B-parameter MoE checkpoint published to Hugging Face with no model card and an empty README. Total file size 641 GB. The license change — from a custom DeepSeek license to MIT — matches R1’s terms and unlocks unrestricted commercial use. Discovered within hours by Simon Willison and Techmeme. via Simon Willison
- Benchmark jumps: MMLU-Pro +5.3, GPQA +9.3, AIME +19.8, LiveCodeBench +10.0. Versus original V3: MMLU-Pro 75.9 → 81.2, GPQA 59.1 → 68.4, AIME 39.6 → 59.4, LiveCodeBench 39.2 → 49.2. MATH-500 hit 94%, beating GPT-4.5 and Claude 3.7 Sonnet. Reasoning, coding, function calling, and Chinese-language writing all improved — gains came from a revamped post-training pipeline that ported RL techniques from R1, not architectural changes. via Hugging Face
- V3-0324 runs on a single Mac Studio at ~20 tok/s. Simon Willison demonstrated the 641GB model running on a $10K consumer-grade 512GB M3 Ultra Mac Studio using the 352GB 4-bit quantized version via MLX. The combination of MIT license + consumer-hardware feasibility was immediately framed as direct pressure on closed labs. via VentureBeat
- Alibaba open-sources Qwen2.5-VL-32B-Instruct under Apache 2.0. Successor to the Qwen2.5-VL series, optimized with reinforcement learning at the 32B scale. Reported to outperform Mistral-Small-3.1-24B and Gemma 3 27B, and to beat the larger Qwen2-VL-72B-Instruct on key VL benchmarks. Stronger math reasoning and improved fine-grained image understanding. via Qwen
- Function calling, JSON output, FIM completion ship natively in V3-0324. Per the DeepSeek API change log: better style and content quality in medium-to-long-form writing aligned with R1’s prose, more thorough Chinese rewriting, optimized translation, improved multi-turn interactive rewriting. Tool-use and front-end web/coding tasks called out as targets of the post-training upgrade. via DeepSeek
Who shipped
DeepSeek and Alibaba — back-to-back Chinese open-weights drops within hours of each other. NVIDIA, OpenAI, Anthropic, Google DeepMind, xAI and Meta all stayed quiet. After Tencent’s Hunyuan-T1 on Friday, the early-2025 open-weights cadence is unambiguously dominated by Chinese labs.
Open-source pulse
Both V3-0324 (MIT) and Qwen2.5-VL-32B (Apache 2.0) shipped under permissive licenses — explicitly inviting commercial use. The license switch may be the bigger story than the benchmark jumps: a 685B-parameter MIT-licensed model is the most aggressive open release any lab has made in 2025.
Money, infra & hardware
The Mac Studio demo reframed the cost-of-frontier-inference question — a 685B MoE running at 20 tok/s on consumer hardware undercuts the hyperscaler-only narrative around frontier inference.
Quiet corners
The NSF Workshop on the Future of AI and the Mathematical & Physical Sciences opened in Cambridge for a three-day run (Mar 24-26), framing AI-for-science and science-for-AI as inseparable items for US federal research funding. No US frontier-lab announcements specifically dated Mar 24.
By the numbers
- 685B — total parameters (671B main + 14B MTP module)
- 37B — active parameters per token
- 641 GB — total file size; 352 GB 4-bit quantized
- 20 tok/s — on a $10K Mac Studio via MLX
- +19.8 AIME, +10.0 LiveCodeBench, +9.3 GPQA — benchmark jumps vs original V3
- 94% — MATH-500, beating GPT-4.5 and Claude 3.7 Sonnet
- Most-mentioned company: DeepSeek
Compiled by AI Feed’s editor from verified web sources for 24 March 2025.