Amazon Science
Diverse reasoning traces teach LLMs to make better decisions
How to train language models to generate diverse, accurate reasoning paths using tokens that control distinct reasoning strategies.