Skip to content
Chip Huyen · Tech Media

Generation configurations: temperature, top-k, top-p, and test time compute

ML models are probabilistic. Imagine that you want to know what’s the best cuisine in the world. If you ask someone this question twice, a minute apart, their answers both times should be the same. If you ask a model the same question twice, its answer can change. If the model thinks that Vietnamese cuisine has a 70% c