HF Daily Papers
VeriEvol: Scaling Multimodal Mathematical Reasoning via Verifiable Evol-Instruct
Scaling reinforcement learning for visual mathematical reasoning requires more than generating harder questions: as data volume grows, the reward labels themselves must remain reliable. Yet existing data pipelines scale supervision while trusting the labeller, and policy-side methods assume the unde