arXiv cs.LG June 24, 2026 · Papers

Teaching Diffusion to Speculate Left-to-Right

arXiv:2606.11552v2 Announce Type: replace-cross Abstract: Large language models (LLMs) achieve remarkable performance across a wide range of tasks, but their autoregressive decoding process incurs substantial inference costs due to inherently sequential token generation. Speculative decoding addresses this bottleneck b
