Alibaba Releases Qwen2.5 Omni, Adds Voice and Video Modes to Qwen Chat


Alibaba, on Wednesday, added voice and video chat capabilities to Qwen Chat, alongside releasing its new open-source model, Qwen2.5-Omni-7B, which made this possible. The model was released under the Apache 2.0 licence.

The company highlighted in a blog post that Qwen2.5-Omni is the new flagship end-to-end multimodal model in the Qwen series. It stated that the model is designed for multimodal perception and seamlessly processes text, images, audio, and video, delivering real-time streaming responses through text and speech synthesis.

The key features of the model include a ‘Thinker-Talker’ architecture, which enables it to provide real-time responses. The Thinker part of the architecture is a Transformer decoder that acts like the brain, while the Talker, designed as a dual-track autoregressive Transformer decoder, operates like the human mouth.
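The division of labour between the two components can be pictured with a toy pipeline. The sketch below is purely illustrative: the function names, the fake hidden state, and the assumption that the Talker consumes the Thinker's output step by step as it is produced are all hypothetical stand-ins, not code or details from the Qwen2.5-Omni release.

```python
from typing import Iterator, List, Tuple

# Hypothetical stand-ins: none of these names come from the Qwen2.5-Omni codebase.
Hidden = List[float]


def thinker(prompt: str) -> Iterator[Tuple[str, Hidden]]:
    """Toy 'Thinker': yields one text token at a time plus a fake hidden state."""
    for word in prompt.split():
        text_token = word.upper()              # placeholder for real decoding
        hidden = [float(len(text_token))]      # placeholder for a hidden representation
        yield text_token, hidden


def talker(stream: Iterator[Tuple[str, Hidden]]) -> Iterator[Tuple[str, int]]:
    """Toy 'Talker': consumes Thinker output as it arrives and yields a fake audio token."""
    for text_token, hidden in stream:
        audio_token = int(hidden[0]) % 256     # placeholder for speech-token generation
        yield text_token, audio_token


def respond(prompt: str) -> List[Tuple[str, int]]:
    """Chain the two stages; generators keep the pipeline streaming step by step."""
    return list(talker(thinker(prompt)))


if __name__ == "__main__":
    for text_token, audio_token in talker(thinker("hello qwen")):
        print(text_token, audio_token)
```

Because both stages are generators, the toy Talker starts producing output after the first Thinker step rather than waiting for the full reply, which is the intuition behind real-time responses.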

Alibaba’s Qwen2.5-Omni model has shown strong performance across various tasks, including speech recognition, translation, audio and video understanding, and speech generation, outperforming similar models at tasks that require multiple modalities.

It was compared to similar single-modality and closed-source models like Qwen2.5-VL-7B, Qwen2-Audio, and Gemini-1.5-pro, achieving state-of-the-art performance.

Voice Chat + Video Chat! Just in Qwen Chat (https://t.co/FmQ0B9tiE7)! Now you can chat with Qwen just like making a phone call or making a video call! Check the demo in https://t.co/42iDe4j1Hs
What's more, we opensource the model behind all this, Qwen2.5-Omni-7B, under the… pic.twitter.com/LHQOQrl9Ha

— Qwen (@Alibaba_Qwen) March 26, 2025

The paper and code for the new model can be found on GitHub, while the AI model is available on Hugging Face along with a demo.

Last month, Alibaba also launched QwQ-Max-Preview, a new AI reasoning model within the Qwen family that specialises in mathematics and coding tasks and features a “thinking” capability in the Qwen Chat application.

The model, which outperformed OpenAI’s models on the LiveCodeBench leaderboard, is expected to have smaller variants open-sourced for local device deployment, as well as a dedicated mobile app.

There may be even more coming, considering Alibaba’s commitment to investing over $52 billion in AI over the next three years.

The post Alibaba Releases Qwen2.5 Omni, Adds Voice and Video Modes to Qwen Chat appeared first on Analytics India Magazine.
