arXiv cs.CL
Cliff Tokens: Identifying Single-Token Failure Triggers in LLM Mathematical Reasoning
arXiv:2606.25524v2 (Announce Type: replace-cross)
Abstract: Large language models (LLMs) reach high accuracy in mathematical reasoning, but individual traces on the same problem diverge: some arrive at the correct answer while others fail. Prior work analyzes failure at the step, chunk, or sentence level, or at tokens wh