ML @ CMU Papers 5 days ago

Healthcare Benchmarks Are Only as Good as Their Assumptions

In healthcare settings where patients use LLMs as medical assistants, LLM performance differs between evaluation and deployment. (a) Bean et al. (2025) find a 61…