iVGR: Internalizing Visually Grounded Reasoning for MLLMs with Reinforcement Learning
While visually grounded Chain-of-Thought (CoT) has emerged as a promising paradigm to enhance fine-grained perception in multimodal large language models (MLLMs), its efficacy duri...