r/LocalLLaMA
Biggest capable model that fits in 64 GB of VRAM for distillation
Hi all, I have 64 GB of VRAM and I'm looking for the biggest model I can use for distillation, preferably a reasoning model. I'd be happy with even 12 tokens per second. A 72B model can fit on my machine. I have dual R9700s, so I don't have the speed, but I do have the memory.

submitted by /u/AppropriatePush6262
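For anyone checking the "72B fits in 64 GB" claim, here is a rough back-of-the-envelope sketch. It counts weight memory only, and the 8 GB reserve for KV cache and runtime overhead is an assumed placeholder, not a measured figure. The function names are made up for illustration, and real quantized formats usually use slightly more bits per weight than their nominal value.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in decimal GB."""
    # 1 billion parameters at 8 bits per weight is 1 GB.
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float = 64.0, reserve_gb: float = 8.0) -> bool:
    """True if weights plus an assumed reserve fit in the given VRAM.

    reserve_gb is a guess for KV cache and runtime overhead; tune it
    for your context length and backend.
    """
    return weight_gb(params_billion, bits_per_weight) + reserve_gb <= vram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        size = weight_gb(72, bits)
        verdict = "fits" if fits(72, bits) else "does not fit"
        print(f"72B at {bits}-bit: ~{size:.0f} GB of weights, {verdict} in 64 GB")
```

By this estimate a 72B model only fits at roughly 4-bit quantization (about 36 GB of weights); at 8-bit the weights alone are about 72 GB, which is already over 64 GB.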