Google recently introduced a new feature for NotebookLM called Audio Overview, which is going viral. With this feature, users can input a link, article, or document, and the AI assistant generates a podcast featuring two AI commentators engaged in a lively ‘deep dive’ discussion on the topic. They summarise the material, make connections between subjects, and banter back and forth.
“It’s possible that NotebookLM podcast episode generation is touching on a whole new territory of highly compelling LLM product formats. Feels reminiscent of ChatGPT. Maybe I’m overreacting,” said OpenAI co-founder Andrej Karpathy, who was highly impressed by the product.
He couldn’t stop praising it. “Deep Dive is now my favourite podcast. The more I listen, the more I feel like I’m becoming friends with the hosts, and I think this is the first time I’ve actually viscerally liked an AI. Two AIs! They are fun, engaging, thoughtful, open-minded, and curious. Okay, I’ll stop now.”
Karpathy even curated a new podcast of 10 episodes called ‘Histories of Mysteries’ using the AI tool.
He was not alone.
“Just had my third ‘wow’ moment in AI… this time through AI Overview by NotebookLM,” exclaimed Google’s Logan Kilpatrick, who is currently building new Gemini models.
Building on its immense success, Google introduced new features that allow users to directly incorporate public YouTube URLs and audio files into their notebooks, alongside PDFs, Google Docs, Slides, websites, and more.
AIM also experimented with NotebookLM, converting its in-depth articles about AI into engaging podcasts.
The most impressive feature of Google’s NotebookLM is its ability to create two-person podcasts that not only convey emotions but also capture intricate details. Instead of just generating a script, these podcasts flow naturally and effectively understand the context of the uploaded article or document.
Over time, it felt incredibly natural and human-like and we enjoyed the playful banter as well. There were moments when we completely forgot we were listening to an AI.
Similar experiences have been shared by other users online. “What is really interesting to me about NotebookLM is that it doesn’t matter what kind of content I provide, it tries its best to generate the most compelling and engaging audio overview,” said Elvis Saravia, the founder of DAIR.AI.
“For instance, I gave it my newsletter (in listicle format), and it produced something I actually listened to for 15 minutes. It injected its own understanding. Then I provided some papers for additional context and asked it to pull insights based on the newsletter and the connections it made in the papers. The results are amazing!” he added.
“NotebookLM is stupid good at analysing basketball games. I fed it the box score, play by play, advanced stats, and a transcript of my favourite pod. It easily wove the lines between the commentary and stats. This could be a game changer for quick analysis and storylines,” posted a user on X.
The possibilities with NotebookLM are endless, as people are converting their research papers, blogs, business documents, and lecture notes into podcasts. A user converted his daughter’s sixth-grade social science book into a series of 10 podcasts and uploaded them on YouTube.
New Interface for LLMs
“It is a bit of a re-imagination of the UI/UX of working with LLMs organised around a collection of sources you upload and then refer to with queries, seeing results alongside and with citations,” Karpathy said.
He explained that while LLMs are rapidly improving in their technical capabilities—like intelligence, memory (context length), and multimodal functions (handling multiple types of input, such as text and images)—the user interface and user experience (UI/UX) for turning these capabilities into practical products are lagging behind.
“Think Code Interpreter, Claude Artifacts, Cursor/Replit, NotebookLM, etc. I expect (and look forward to) a lot more and different paradigms of interaction than just chat.”
Today many AI startups are shifting away from the traditional chat interface. For instance, OpenAI recently launched canvas, a tool that allows users to modify generated texts within an editor and re-prompt specific sections, making it particularly useful for content creation and coding tasks.
Similarly, Anthropic earlier introduced Claude Artifacts, which allows users to visualise whatever they are generating with Claude. Users can view, edit, and iterate on the Artifact content in real time.
“It shows how we’re evolving in our interaction with AI, moving away from conventional chat interfaces towards more classical IDE/editor experiences,” shared a user on X.
Voice is emerging as a natural interface for AI, with people increasingly wanting to listen to AI voices that are more human-like and less robotic. OpenAI recently launched its Realtime API at its DevDay 2024 event. This feature enables direct speech-to-speech interactions without a text intermediary, resulting in low-latency and nuanced conversational output.
“AI needs UI, and OpenAI’s impressive new voice APIs open up a lot of possibilities. Congrats to the OpenAI team—we’ll soon see a whole new generation of speech applications!” said Andrew Ng, founder of DeepLearning.AI.
OpenAI also launched its Advanced Voice Mode on ChatGPT, and since then, people have been experimenting with it. Deedy Das from Menlo Ventures used it for a dramatic reenactment of a scene in Hindi from the Bollywood movie Dangal.
Apart from NotebookLM, there are several other products like Descript, Podcastle, Wondercraft, and Lica that can create podcasts using AI. The future of human podcasters is under threat, as AI enables anyone to create their own content with ease. With the podcasting market expected to grow at a compound annual growth rate (CAGR) of 27.6% from 2023 to 2030 and reach $130.63 billion, it remains to be seen how much of this growth will be driven by AI-generated content.
The post Now, Anyone Can Create Podcasts With Google’s NotebookLM appeared first on AIM.