X · @teortaxesTex
· X / Twitter
Extending DSpark to GDN-based models will be huge. I hope Kimi team in K3 uses it (or some more advanced acceleration in this genre) from the get-go. This also speeds up RL…

Hikari∣LocalLLM⚡: Progress report! Training of the DFlash backbone and markov head is complete, enabling DSpark to be used on 27B. We will now train t