arXiv cs.CV
Reflective VLA: In-Context Action Consequences Make VLAs Generalize
arXiv:2606.25215v1 Announce Type: new
Abstract: Most vision-language-action (VLA) models are reactive: they predict the next action from the current instruction and observation, implicitly assuming that the current observation fully specifies the action-relevant state. In embodied control, however, embodiment-specific