Gemma 4 26BA4B Surprisingly Usable at IQ3_S – Are small quants really this usable?
I've been experimenting with using lower quants of Gemma 4 26B on my M3 16gb MacBook Air. The Quant runs at a solid 25 tokens per…
I've been experimenting with using lower quants of Gemma 4 26B on my M3 16gb MacBook Air. The Quant runs at a solid 25 tokens per…
How a Python agent and a Go agent collaborate on contract compliance using the Agent2Agent protocolY...
AI coding agents are rapidly shifting from reactive assistants that complete tasks when prompted to ...
Celebrating the first anniversary of the Agent-to-Agent (A2A) protocol, this blog post highlights how the framework enables autonomous AI agents to securely collaborate and hand off…
This post introduces three architectural patterns designed to integrate Model Context Protocol (MCP) Apps and Agent-to-User Interface (A2UI) to solve the tradeoff between highly custom iframe…
An open specification for finding and verifying tools, skills, and agents across the web.Agents are ...
RT Leandro von WerraWe launched an agent collaboration with a simple task: make Gemma 4 faster.Over 100 agents from all over the world joined, exchanged 1000+…
Google has officially launched the TPU Developer Hub, a centralized educational resource designed to help model builders and developers maximize the performance of Google Cloud TPUs.…
Today, we are announcing the availability of the Gemma 4 family on Amazon Bedrock. Built by Google DeepMind and released under the Apache 2.0 license, Gemma…
Awesome to see this innovation in text diffusion. DiffusionGemma is lightning fast, 4x faster than other Gemma 4 models! Congrats to @bodonoghue85 and the team who…
DiffusionGemma is an experimental text-generation model built on the Gemma 4 architecture that uses diffusion-based parallel generation instead of token-by-token autoregression, enabling much faster inference, bidirectional…
Google has announced the Google Colab Command-Line Interface (CLI), a new tool that allows developers and AI agents to connect local terminals to remote Colab runtimes…
Celebrating the milestone of a massive 150+ million downloads of Gemma 4 with the release of the new Gemma 4 12B model! It's incredibly powerful for…
Google DeepMind’s Gemma 4 12B model brings agentic, multimodal AI capabilities to everyday laptops with 16GB of RAM, enabling local data processing and visual insight generation.…
The newly released Gemma 4 12B is a dense, multimodal model designed for high-performance local AI execution on consumer devices. By introducing a novel, encoder-free architecture,…
RT Nathan LambertGemma 4 adoption numbers outpacing Qwen 3.5/3.6 for the same sized models is a big shift in the international balance of influence via open…
An eventful month with one flagship release after another
New article: a visual tour of recent LLM architecture advances, from Gemma 4 to DeepSeek V4.I focus on long-context efficiency tweaks like KV sharing, per-layer embeddings,…
Hint: it's not benchmark scores.
Gemma is Google DeepMind's open-weight model family — a sister line to Gemini, intended for self-hosted use. Gemma 2 and Gemma 3 sizes range from 2B to 27B; competitive with Llama for the same parameter budget.
Owner: Google. We have 19 stories indexed for this model, auto-tagged from titles across every tracked source — official announcements, papers, GitHub release notes, and third-party press. The CTA on each card links to the original; the official site is ai.google.dev.
Related text models: GPT, Claude, Gemini, Llama, Mistral, Grok, Qwen, DeepSeek.