r/LocalLLaMA
Mimo 2.5 is _fast_ at large context (dual RTX Pro 6000)
For agentic work, fast high context is king. OpenCode fills the window quickly, and most models that feel snappy at 8k context turn into dial-up ADSL brrr by the time you're 150k deep. So I've been testing lots of models and runners, trying to get "local Sonnet" on 2x RTX PRO 6000 (spoiler: yes!). The drop-off