New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4. I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings,…