r/MachineLearning
· Communities
Built an LLM training framework that actually runs on older GPUs without crashing [P]
Hey guys, I was playing around with Nanotron recently and got super frustrated by how many heavy, hardware-specific dependencies it imports at the module level ( flash-attn , triton, functorch , etc.). If you try to run it on older or budget GPUs like a T4 or V100, it just crashes on import. So I wrote Picotron (https: