arXiv cs.CL
SHOVIR: A Benchmark for Evaluating Vision Shortcut Learning in Radiology Report Generation
arXiv:2606.30201v1 Announce Type: cross
Abstract: Current evaluation protocols for Vision-Language Models (VLMs) in Radiology Report Generation (RRG) rely on report-level metrics that measure lexical overlap or aggregate clinical correctness. However, such metrics do not test whether individual diagnostic statements st