ElevenLabs Introduces Real-Time Streaming for Text-to-Speech, Offers Multilingual Experience Similar to Google’s Bard

ElevenLabs has unveiled input streaming, a new feature for generating speech in real time with sub-1-second latency. Available via the ElevenLabs platform, it lets users listen to Large Language Model (LLM) responses as they are being generated.

We have just released input streaming, which allows you to stream LLM responses and generate speech in real-time – all possible with sub-1-second latency.
Try it today: https://t.co/JcUgx0BElg https://t.co/uvnUOU8t0q

— ElevenLabs (@elevenlabsio) August 7, 2023
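
To make the idea of input streaming concrete, here is a minimal Python sketch of the general technique: text fragments are buffered as they arrive from an LLM and released at sentence boundaries, so a speech engine could begin synthesizing before the full response exists. This is an illustration only and not ElevenLabs' implementation. The names `chunk_for_speech` and `fake_llm_stream` are hypothetical, and the punctuation-based boundary rule is a simplifying assumption (it would, for example, split a decimal such as "3.5").

```python
from typing import Iterable, Iterator

# Assumption: a sentence ends at one of these characters.
SENTENCE_END = (".", "!", "?")


def chunk_for_speech(fragments: Iterable[str]) -> Iterator[str]:
    """Yield speakable chunks as soon as a sentence boundary is seen."""
    buffer = ""
    for fragment in fragments:
        buffer += fragment
        # Emit every complete sentence currently held in the buffer.
        while True:
            positions = [buffer.find(mark) for mark in SENTENCE_END]
            cut = min((p for p in positions if p != -1), default=-1)
            if cut == -1:
                break
            yield buffer[: cut + 1].strip()
            buffer = buffer[cut + 1:]
    # Flush any trailing text that never reached a boundary.
    if buffer.strip():
        yield buffer.strip()


def fake_llm_stream() -> Iterator[str]:
    """Hypothetical stand-in for an LLM that emits text piece by piece."""
    yield from ["Hel", "lo the", "re. How ", "are you", "? Fine"]


if __name__ == "__main__":
    for chunk in chunk_for_speech(fake_llm_stream()):
        # In a real pipeline, each chunk would be handed to a TTS engine here.
        print(chunk)
```

Each chunk is available as soon as its sentence is complete, which is what allows audio playback to begin while the rest of the response is still being written.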

Elevating the experience further, ElevenLabs introduces the eleven_multilingual_v1 model, presenting an array of voices that breathe life into content. This model supports a diverse range of languages, including English, German, Polish, Spanish, Italian, French, Portuguese, and Hindi.

With just a few lines of code, creators and developers can put these voices to work in their own applications. Users can also choose among different voices or even clone their own.
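
As a rough idea of what "a few lines of code" might look like, the Python sketch below builds an HTTP request for ElevenLabs' text-to-speech streaming endpoint using only the standard library. The URL path, the `xi-api-key` header, and the JSON field names reflect the public REST API as understood at the time of writing and should be checked against the official API reference. The helper names `build_tts_request` and `save_speech`, along with the placeholder voice ID and API key, are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; verify against the official API reference.
API_BASE = "https://api.elevenlabs.io/v1"


def build_tts_request(
    text: str,
    voice_id: str,
    api_key: str,
    model_id: str = "eleven_multilingual_v1",
) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for streamed speech."""
    payload = json.dumps({"text": text, "model_id": model_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/text-to-speech/{voice_id}/stream",
        data=payload,
        method="POST",
        headers={"xi-api-key": api_key, "Content-Type": "application/json"},
    )


def save_speech(request: urllib.request.Request, out_path: str) -> None:
    """Send the request and write the returned audio bytes to a file."""
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as out:
        while chunk := response.read(4096):
            out.write(chunk)


if __name__ == "__main__":
    # Placeholders: replace with a real voice ID and API key before running.
    req = build_tts_request("Hola, mundo.", "YOUR_VOICE_ID", "YOUR_API_KEY")
    save_speech(req, "speech.mp3")
```

Keeping request construction separate from the network call makes it possible to inspect the request without contacting the service.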

Interestingly, the tool’s features bear a resemblance to Google’s Bard, a conversational AI chatbot that can also read its responses aloud. Bard’s expansion to 40 new languages, including Arabic, Chinese, German, and Spanish, has broadened its global reach.

Both ElevenLabs and Bard cater to a multilingual audience, offering spoken output in various languages. While Bard draws on the extensive content Google has fed it to ensure accuracy, ElevenLabs offers real-time text streaming, producing speech as the text arrives.

Whether exploring pronunciation or simply relishing the auditory rendition, ElevenLabs and Bard create a symphony of linguistic possibilities for users worldwide.

ChatGPT from OpenAI, by contrast, lacks a built-in text-to-speech model, a notable gap in its capabilities. OpenAI offers the Whisper API for speech-to-text but has not rolled out a comparable API for text-to-speech. Perhaps it could take a cue from ElevenLabs, which has introduced innovative features in this area.

The post ElevenLabs Introduces Real-Time Streaming for Text-to-Speech, Offers Multilingual Experience Similar to Google’s Bard appeared first on Analytics India Magazine.