Dwarkesh Patel
RLVR might be disproportionately bad at science
The verification loop for theories can be on the order of decades or centuries, and even then, what we know today as the better theory can often actually make worse predictions.