RT clem 🤗: Kog open-sourced on @huggingface the 2B model that they used to show a model running at 3,000+ tokens per second. Very cool work! https://huggingface.co/blog/kogai/kog-laneformer-2b-the-latency-first-model