arXiv cs.LG
Teaching Diffusion to Speculate Left-to-Right
arXiv:2606.11552v2
Abstract: Large language models (LLMs) achieve remarkable performance across a wide range of tasks, but their autoregressive decoding process incurs substantial inference costs due to inherently sequential token generation. Speculative decoding addresses this bottleneck b