Alibaba's Qwen team has unveiled Qwen3.5-LiveTranslate-Flash, a real-time multimodal translation model that processes audio and video simultaneously. The model accepts 60 input languages and produces speech output in 29 languages with a latency of 2.8 seconds. Key features include real-time cloning of the speaker's voice and vision-enhanced comprehension that draws on lip movements. It is available via API on Alibaba Cloud. https://www.marktechpost.com/2026/05/20/alibaba-qwen-team-introduces-qwen3-5-livetranslate-flash-real-time-multimodal-interpretation-across-60-languages-at-2-8-second-latency/ #AIagent #AI #GenAI #AIResearch
