cheerupzhu/PPM_VLA: This work presents an efficient physics-based Vision-Language-Action (VLA) approach that integrates Vision-Language Models (VLMs) with diffusion models to generate trajectory predictions with enhanced physical realism.
