X · @teortaxesTex
I want to bring your attention to JetSpec because it looks strictly smarter and stronger than previous speculative decoding and block diffusion approaches (yes, again). Avg 1000 t/s single stream with Qwen-8B on B200. Basically, you can better utilize compute at any batch size.

Hao AI Lab: Introducing JetSpec: we find sp