arXiv cs.LG · Papers

Teaching Diffusion to Speculate Left-to-Right

arXiv:2606.11552v2 · Announce Type: replace-cross

Abstract: Large language models (LLMs) achieve remarkable performance across a wide range of tasks, but their autoregressive decoding process incurs substantial inference costs due to inherently sequential token generation. Speculative decoding addresses this bottleneck b