cheerupzhu/PPM_VLA: This work presents an efficient physics-based Vision-Language-Action (VLA) approach that integrates Vision-Language Models (VLMs) with diffusion models to generate trajectory predictions with enhanced physical realism.