X · @ollama June 24, 2026 · X / Twitter

RT jietang: 5.2 could be better with more RL …

RT jietang: 5.2 could be better with more RL ...

青龍聖者: Deepswe's benchmark results match my own experience. I've used all the models: GLM 5.2 ≈ Claude Opus 4.6–4.7. Kimi 2.7 code is more like inference optimization. Looking forward to K3. Doubao-seed 2.1 Pro, at around 37%, ≈ Gemini 3.5 Flash. Coding is quite weak, but vision is strong.
