World Pilot: Steering Vision-Language-Action Models with World-Action Priors
Vision-Language-Action (VLA) models inherit semantic grounding from large-scale pretraining and perform competently across in-distribution manipulation tasks. This grounding, howev...