r/LocalLLaMA June 24, 2026 · Communities

Gemma 4 26BA4B Surprisingly Usable at IQ3_S – Are small quants really this usable?

I've been experimenting with lower quants of Gemma 4 26B on my M3 MacBook Air with 16 GB of RAM. The quant decodes at a solid 25 tokens per second and is really close to bf16 for my use cases (no coding, tool calling). Do I have confirmation bias, or are UD Q3 quants surprisingly good? Anyhow, huge props to the Unslo
