arXiv cs.CV June 24, 2026 · Papers

Mind the Heads: Topological Representation Alignment for Multimodal LLMs

arXiv:2606.23885v1 · Announce Type: new

Abstract: Representation alignment has emerged as an effective approach to improve Multimodal Large Language Models (MLLMs) by regularizing their internal representations toward those of an external vision encoder. However, existing methods typically align a fixed layer of the lang
