AI Daily Brief — 01 March 2025

Saturday closed DeepSeek’s Open Source Week with a “One More Thing” surprise that doubled as a market-economics gut punch. The lab published a technical write-up of the production inference system behind V3 and R1, covering cross-node expert-parallel batch scaling, computation-communication overlap, and load balancing, and reporting 73,700 input and 14,800 output tokens per second per H800 node. The same write-up disclosed cost-revenue math for the first time: at $2/hr GPU rental the cluster costs roughly $87,072/day, and if all traffic were billed at R1 rates, theoretical daily revenue would be roughly $562,000. That is about 6.5 times cost, or a 545% theoretical cost-profit margin (profit divided by cost). The lab attached its own caveats: V3 is priced below R1, off-peak discounts apply, and most traffic isn’t monetized, so actual revenue sits well below the theoretical figure. The number landed as a direct counterpoint to GPT-4.5’s pricing of $75 per million input tokens and $150 per million output tokens, announced two days earlier.
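
The margin arithmetic is easy to check. The short Python sketch below recomputes it from the three figures quoted above. The function names are illustrative, and the implied GPU count assumes every GPU is billed for a full 24 hours, which is an assumption made here rather than something stated in this brief.

```python
# Figures quoted in the brief; everything else is derived from them.
GPU_HOURLY_RATE_USD = 2.0      # rental price per GPU-hour
DAILY_COST_USD = 87_072        # quoted daily cluster cost
DAILY_REVENUE_USD = 562_000    # quoted theoretical daily revenue (rounded)


def cost_profit_margin(revenue: float, cost: float) -> float:
    """Return profit as a fraction of cost: (revenue - cost) / cost."""
    return (revenue - cost) / cost


def implied_gpu_count(daily_cost: float, hourly_rate: float) -> float:
    """Return the average GPU count implied by the daily bill.

    Assumption (not from the brief): each GPU is billed for all 24 hours.
    """
    return daily_cost / (hourly_rate * 24)


margin = cost_profit_margin(DAILY_REVENUE_USD, DAILY_COST_USD)
gpus = implied_gpu_count(DAILY_COST_USD, GPU_HOURLY_RATE_USD)

print(f"cost-profit margin: {margin:.0%}")
print(f"implied average GPUs: {gpus:,.0f}")
```

Run as-is, this prints a margin of 545% and an implied average of 1,814 GPUs. Note that the margin is measured against cost, not revenue, which is why it can exceed 100%.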

Top stories

  • DeepSeek closes Open Source Week with V3/R1 Inference System Overview. “One More Thing” Day 6: cross-node expert-parallel batch scaling, computation-communication overlap, load balancing. 73.7k input / 14.8k output tokens/s per H800 node. via DeepSeek on X · via DeepSeek GitHub
  • DeepSeek discloses 545% theoretical cost-profit margin. At $2/hr GPU rental: cluster costs ~$87,072/day; priced entirely at R1 rates, theoretical daily revenue ~$562k. The lab flagged caveats — V3 is cheaper than R1, off-peak discounts exist, most traffic isn’t monetized — but the number landed as the weekend’s defining counterpoint to GPT-4.5 economics. via DeepSeek
  • 3FS continues as the weekend’s open-source talking point. The Fire-Flyer File System dropped Friday as Day 5: Linux-based AI-HPC file system hitting 7.3 TB/s aggregate read throughput on DeepSeek’s clusters, designed for the random-read patterns of LLM training. Tom’s Hardware deep-dive landed on Saturday. via Tom’s Hardware
  • GPT-4.5 weekend reviewer scrutiny. Two days after launch, the weekend consensus was that the model carries a roughly 30x API price premium over GPT-4o for only incremental gains, and that OpenAI’s own researchers were citing o1 as proof that “pre-training isn’t the optimal place to spend compute in 2025.” Access was still limited to Pro subscribers, with the Plus-tier rollout yet to begin. via Axios

Who shipped

DeepSeek’s Day 6 disclosure capped a six-day release cadence. OpenAI, Anthropic, Google DeepMind, Meta, xAI, and Mistral had no launches dated Saturday.

Open-source pulse

Alibaba’s Wan 2.1 held the #1 spot on VBench at 86.22% heading into the weekend; the 1.3B variant, which runs in about 8 GB of VRAM, kept spreading. Microsoft’s Phi-4-multimodal (5.6B, mixture-of-LoRAs) topped Hugging Face’s Open ASR Leaderboard.

Quiet corners

The arXiv announcement system paused for the weekend. The Anthropic Series E close ($3.5B at $61.5B post-money, Lightspeed-led) sat 48 hours away. The Manus AI launch and Mistral OCR were still to come the following week, with Cohere Command A and Google Gemma 3 the week after that.

By the numbers

  • 73.7k / 14.8k tokens/s per H800 node — V3/R1 input / output throughput
  • 545% — DeepSeek theoretical cost-profit margin
  • ~$87K / ~$562K — daily cluster cost / theoretical revenue
  • 7.3 TB/s — 3FS aggregate read
  • Most-mentioned company: DeepSeek

Compiled by AI Feed’s editor from verified web sources for 1 March 2025.