arXiv cs.CV
REALM: A Unified Red-Teaming Benchmark for Physical-World VLMs
arXiv:2606.23892v1 Announce Type: new
Abstract: Vision-language models (VLMs) are increasingly used as perception-reasoning backbones for embodied intelligence in safety-critical physical systems, where perception or reasoning errors can lead to unsafe decisions or actions. Although many red-teaming methods have been d