arXiv cs.CL June 26, 2026 · Papers

Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning

arXiv:2606.25524v2 Announce Type: replace-cross Abstract: Large language models (LLMs) reach high accuracy in mathematical reasoning, but individual traces on the same problem diverge; some arrive at the correct answer while others fail. Prior work analyzes failure at the step, chunk, or sentence level, or at tokens wh

Read original