Alibaba Releases Qwen2.5 Omni, Adds Voice and Video Modes to Qwen Chat


On Wednesday, Alibaba added voice and video chat capabilities to Qwen Chat and released Qwen2.5-Omni-7B, the new model that makes them possible. The model is open source under the Apache 2.0 licence.

The company highlighted in a blog post that Qwen2.5-Omni is the new flagship end-to-end multimodal model in the Qwen series. It stated that the model is designed for multimodal perception and seamlessly processes text, images, audio, and video, delivering real-time streaming responses through text and speech synthesis.

The key features of the model include a ‘Thinker-Talker’ architecture, which allows it to provide real-time responses. The Thinker part of the architecture is a Transformer decoder that acts like the brain, while the Talker, designed as a dual-track autoregressive Transformer decoder, operates like the human mouth.
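To make the division of labour concrete, here is a toy Python sketch of a Thinker-Talker style pipeline. It is not Alibaba's implementation and contains no neural network. The function names `thinker`, `talker`, and `respond` and the `<audio:...>` placeholder tokens are all hypothetical, chosen only to show how one stage can produce text step by step while a second stage turns each step into speech output as it arrives.

```python
from typing import Iterator, List, Tuple


def thinker(prompt: str) -> Iterator[str]:
    """Toy 'Thinker' (hypothetical): yields one text token at a time.

    A real model would decode tokens autoregressively; here we simply
    split the prompt on whitespace to simulate a token stream.
    """
    for word in prompt.split():
        yield word


def talker(text_tokens: Iterator[str]) -> Iterator[Tuple[str, str]]:
    """Toy 'Talker' (hypothetical): emits a placeholder speech token for
    each text token as soon as it arrives, without waiting for the full
    response to finish.
    """
    for token in text_tokens:
        yield token, f"<audio:{token.lower()}>"


def respond(prompt: str) -> List[Tuple[str, str]]:
    """Chain the two stages and collect the (text, speech) pairs."""
    return list(talker(thinker(prompt)))


if __name__ == "__main__":
    for text, speech in respond("Hello Qwen"):
        print(text, speech)
```

Because both stages are generators, each speech placeholder is produced as soon as its text token exists, which is the streaming behaviour the architecture is described as enabling.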

Alibaba’s Qwen2.5-Omni model has shown strong performance across various tasks, including speech recognition, translation, audio and video understanding, and speech generation, outperforming similar models at tasks that require multiple modalities.

It was compared with similar single-modality and closed-source models like Qwen2.5-VL-7B, Qwen2-Audio, and Gemini-1.5-pro, achieving state-of-the-art performance.

Voice Chat + Video Chat! Just in Qwen Chat (https://t.co/FmQ0B9tiE7)! You can now chat with Qwen just like making a phone call or making a video call! Check the demo in https://t.co/42iDe4j1Hs
What's more, we opensource the model behind all this, Qwen2.5-Omni-7B, under the… pic.twitter.com/LHQOQrl9Ha

— Qwen (@Alibaba_Qwen) March 26, 2025

The paper and code for the new model can be found on GitHub, while the AI model is available on Hugging Face along with a demo.

Last month, Alibaba also launched QwQ-Max-Preview, a new AI reasoning model within the Qwen family that specialises in mathematics and coding tasks and features a “thinking” capability in the Qwen Chat application.

The model, which outperformed OpenAI’s models on the LiveCodeBench leaderboard, is expected to have smaller variants open-sourced for local device deployment, as well as a dedicated mobile app.

There may be even more coming, considering Alibaba’s commitment to investing over $52 billion in AI over the next three years.

The post Alibaba Releases Qwen2.5 Omni, Adds Voice and Video Modes to Qwen Chat appeared first on Analytics India Magazine.
