HF Daily Papers
Beyond NL2Code: A Structured Survey of Multimodal Code Intelligence
While Large Language Models (LLMs) have substantially advanced text-to-code synthesis, many real programming tasks specify intent through visual artifacts such as screenshots, charts, vector drawings, videos, and interactive states. These tasks require models to connect visual perception to executab