Generate Music From Text Using Google MusicLM

Image by Freepik

AI development is bigger than ever, especially in the generative AI field. From generating conversational text to generating images from text, it has all become possible.

That advancement has also reached the music generation field, with Google launching a music generation model called MusicLM. The model was released in January 2023, and people have been trying out its capabilities since then. So, what exactly is MusicLM, and how can you try it out? Let's discuss.

Google MusicLM

MusicLM was first introduced in a paper by Agostinelli et al. (2023), where the research group described it as a model for generating high-fidelity music from text descriptions. The model is built on top of AudioLM, and experiments showed it can produce several minutes' worth of high-quality music at 24 kHz while adhering to the text description.

Additionally, the researchers released a public text-to-music dataset, MusicCaps, for anyone who wishes to develop a similar model or extend the research. The data was manually curated and hand-picked by professional musicians.

MusicLM has also been developed following responsible model development practices, addressing concerns about potential misappropriation of creative content through music generation. Extending the work of Carlini et al. (2022), the researchers showed that the tokens generated by MusicLM differ significantly from the training data.

Trying Out MusicLM

If you want to explore sample MusicLM results, the Google research group has provided a simple website showing what MusicLM is capable of. For example, you can explore audio samples generated from text captions.

Image by Author (Adapted from google-research.github.io)

Another example is my favorite sample, the Story Mode music generation, where different styles of music can be integrated into one by using several text prompts.

Image by Author (Adapted from google-research.github.io)

It's also possible to generate music from a painting's caption, potentially capturing the image's mood.

Image by Author (Adapted from google-research.github.io)

The results sound amazing, but how can we try the model ourselves? Luckily, Google has been accepting registrations to test MusicLM in AI Test Kitchen since May 2023. Go to the website and sign up with your Google account.

Image by Author (Adapted from aitestkitchen)

After registering, we need to wait for our turn to try out MusicLM, so keep an eye on your email.

Image by Author (Adapted from aitestkitchen)

That is all for now; I hope you can get your turn soon to try out the exciting MusicLM.

Conclusion

MusicLM is a model by the Google Research group that generates music from text. It can produce several minutes of high-quality music that follows the text instructions. We can try out MusicLM by signing up for AI Test Kitchen, or, if we are only interested in the sample results, we can visit the Google Research website.
Cornellius Yudha Wijaya is a data science assistant manager and data writer. While working full-time at Allianz Indonesia, he loves to share Python and data tips via social media and written media.

More On This Topic

  • How To Generate Meaningful Sentences Using a T5 Transformer
  • How to Generate Synthetic Tabular Dataset
  • 4 Ways to Generate Passive Income Using ChatGPT
  • Generate Synthetic Time-series Data with Open-source Tools
  • How to Generate Automated PDF Documents with Python
  • What is Google AI Bard?

Google DeepMind Will Eclipse OpenAI


People have a lot of hope and expectations for OpenAI; at the same time, many are actually scared and jealous of the Microsoft-funded company. Everyone wants their own 'ChatGPT moment' in AI, but no one has come even close to what OpenAI did. Google tried with Bard and, arguably, failed to deliver even with an active internet connection.

Google brought DeepMind back into the picture to up the ante. OpenAI should be jittery at the moment, with its employees leaving to join Google DeepMind. Moreover, DeepMind founder Demis Hassabis recently claimed that the company's next LLM project, called Gemini, is going to eclipse ChatGPT. Like GPT-4, this LLM is expected to be multimodal, and to combine abilities from AlphaGo, the company's reinforcement learning-based system capable of planning, reasoning, and problem solving.

What exactly gives Gemini an edge over GPT?

Starting with whatever is happening around Reddit, Twitter, and Elon Musk, there is a possibility that the Tesla chief may actually go ahead and acquire Reddit. Interestingly, Musk's plan to build a "woke ChatGPT" rival, possibly built on Twitter data, might include data from Reddit as well. This looks plausible because Reddit CEO Steve Huffman has been praising Musk's way of managing and handling Twitter amid the troubles his own platform is going through.

That said, DeepMind with Google has something that no one else has – YouTube. While Twitter is filled with textual data, YouTube is a gold mine for visual, audio, and textual data in almost every single language on Earth. Moreover, the videos that exist on YouTube are not just memes or reels like Twitter or Instagram, but are also filled with tutorials, lectures, and podcasts.

If Google DeepMind's claim that Gemini is going to be multimodal holds, it might actually have the best understanding and generation model out there right now. For DeepMind, getting data from YouTube is the best thing it could have asked for: multimodal, multilingual, and multiregional. Moreover, DeepMind, being based out of London, found itself restricted by the stringent EU laws regulating AI. Now that it is again part of the US-based Google, it can benefit from the laws that govern the parent company. As for the legal issues around this, interestingly, Google's privacy policy clearly states that it has the right to use the videos for its own purposes, even if the users hold the copyright.

On the flip side, while OpenAI has been claiming that Google’s Bard was trained on ChatGPT data, there are allegations that OpenAI also used its Whisper model to train on YouTube’s data by converting audio to text. The tables have turned for OpenAI.

OpenAI isn’t scared enough

Most recently, when Altman and Ilya Sutskever visited Israel on their ongoing world tour, they were met with a very direct and blunt question from the audience: "What is the secret sauce of ChatGPT?" The questioner also asked whether developers trying to replicate its capabilities using open-source models like LLaMA, Falcon, or Vicuna would ever be able to catch up with OpenAI's offering, or should just give up.

During Sam Altman and Ilya Sutskever's recent trip to Israel, folks didn't beat around the bush.
First question: spicy open source vs closed source debate 🔥
Followed immediately by: "Can you tell us more about the base model before you lobotomized it?" 🤣 pic.twitter.com/fTcuT842Jv

— Bilawal Sidhu (@bilawalsidhu) June 28, 2023

To this, Sutskever said that open source is nowhere close. Even if someone replicates the technology in the coming years, which is inevitable, OpenAI will already be far ahead with what it is building.

OpenAI is proud of what it has built. When Sam Altman came to India, he was asked if a team of people would be able to build what OpenAI has built, to which he said it was "hopeless" to even think about it. The only advantages OpenAI has always had were Microsoft's backing and the first-mover advantage, something Google has been trying to overcome for the last two years.

Hassabis' stance on the risks of AI hasn't been clear. He earlier backed a pause on giant AI experiments, and is now building a competitor to OpenAI. So, while OpenAI expands its office to London to give DeepMind some tension, DeepMind with Google is going to be trouble for Microsoft and OpenAI back in the US.

There is no way the AI revolution is at its peak, since the leaders are still climbing higher while trying to pull each other down.

The post Google DeepMind Will Eclipse OpenAI appeared first on Analytics India Magazine.

It’s a Wrap on the Women in Data Science Conference at Intuit

Intuit India recently concluded the Women in Data Science (WiDS) conference held at their Bangalore office on June 20. The event brought together over 80 data science professionals, who left feeling empowered and inspired by the insightful sessions that took place. WiDS featured a variety of speakers, including leading data scientists, entrepreneurs and educators.

The year marked a significant milestone as WiDS introduced an engaging mentoring session with accomplished women leaders in the data science field, giving the participants an opportunity to gain insights and tips for excelling in the industry from the leaders themselves.

The event consisted of a variety of informative tech talks, keynote sessions, fireside chats, networking sessions, and much more.

Here is a quick synopsis of the sessions:

Keynote – Data Efficient Matching

The WiDS Ambassador Note was presented by Sravyasri Garapati from Intuit, followed by an enlightening session on ‘Data Efficient Matching’ by Soma Biswas, an Associate Professor at the Indian Institute of Science (IISc) and a senior member of IEEE.

Data Efficient Matching involves the effective pairing of data across diverse sources or entities despite having limited data at hand. It tackles the difficulties posed by sparse or incomplete information by leveraging sophisticated methodologies like machine learning, probabilistic modelling, and data fusion.

The objective is to streamline the matching procedure while minimising data requirements, enabling successful matching even when only a small amount of information is accessible. This approach finds applications in various domains, such as identity resolution, recommendation systems, and data integration, where efficient matching plays a vital role in ensuring precise and dependable outcomes.
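As a toy illustration of the matching idea, the sketch below pairs customer records from two sources using nothing but fuzzy string similarity from Python's standard library. This is a deliberate simplification: real data-efficient matching systems use richer probabilistic models and data fusion, and the record names here are invented for the example.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized string similarity in [0.0, 1.0]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def match_records(source_a, source_b, threshold=0.8):
    """Greedily pair records from two sources by name similarity.

    Returns (record_a, record_b, score) triples whose similarity
    clears the threshold; records with no good match are dropped.
    """
    matches = []
    for rec_a in source_a:
        best = max(source_b, key=lambda rec_b: similarity(rec_a, rec_b))
        score = similarity(rec_a, best)
        if score >= threshold:
            matches.append((rec_a, best, round(score, 2)))
    return matches

# Two sparse, inconsistently formatted sources (hypothetical data)
crm = ["Acme Corp.", "Globex Ltd", "Initech"]
billing = ["ACME Corporation", "Globex Limited", "Umbrella Inc"]
print(match_records(crm, billing, threshold=0.6))
```

Even with a low threshold, "Initech" finds no counterpart and is left unmatched, which mirrors the core difficulty: deciding when limited evidence is enough to declare a match.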

Structured Data to Analytical Text Generation

Google researcher Preksha Nema delivered an insightful speech on ‘Structured Data to Analytical Text Generation’. Nema’s research primarily focuses on understanding and generating purposeful ads, particularly in multilingual and low-resource contexts. In 2017, she received the Google PhD India Fellowship.

Structured data to analytical text generation refers to the conversion of structured data, such as organised databases or spreadsheets, into comprehensible analytical text through natural language processing (NLP).

This process employs NLP methods to translate numerical information into meaningful narratives or summaries. By extracting significant findings and trends from the structured data, this technique allows for the creation of concise and coherent textual explanations, enhancing comprehension and interpretation of the original data. It proves to be an invaluable resource for data-informed decision-making, generating reports, and effectively communicating intricate information to diverse audiences.
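A minimal sketch of the data-to-text idea: the hypothetical function below turns a small structured table into a one-sentence analytical summary using hand-written templates. Production systems described in the talk would use learned NLP models rather than templates, but the input/output shape is the same.

```python
def summarize_sales(rows):
    """Turn structured sales rows into a short analytical summary.

    rows: list of dicts with 'region' and 'revenue' keys.
    """
    total = sum(r["revenue"] for r in rows)
    best = max(rows, key=lambda r: r["revenue"])
    share = best["revenue"] / total * 100
    return (
        f"Total revenue across {len(rows)} regions was ${total:,.0f}. "
        f"{best['region']} led with ${best['revenue']:,.0f} "
        f"({share:.0f}% of the total)."
    )

# Hypothetical spreadsheet-style input
data = [
    {"region": "North", "revenue": 120_000},
    {"region": "South", "revenue": 80_000},
    {"region": "West", "revenue": 200_000},
]
print(summarize_sales(data))
```

The point of the exercise is the extraction step: the summary surfaces a trend (which region led, and by how much) that a reader would otherwise have to compute from the raw table.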

Role of Analytics in Data Governance

Kalapriya Kannan, representing Hewlett Packard Enterprise, discussed the ‘Role of Analytics in Data Governance’. Her expertise lies in machine learning, natural language systems, and their applications. Analytics plays a pivotal role in the realm of data governance by facilitating efficient management and utilisation of data assets. It harnesses the power of insights derived from data to establish robust policies and procedures governing data quality, integrity, and usage patterns.

Through the application of analytics, organisations can identify irregularities within data, track its origin and evolution, ensure adherence to regulatory requirements, and make well-informed decisions based on reliable and precise information.

Analytics further aids in the continuous monitoring of data governance processes, measuring performance indicators, and driving ongoing enhancements in data management practices. Ultimately, analytics empowers organisations to optimise the value of their data while upholding its integrity and security.
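As one concrete example of "identifying irregularities within data", the sketch below flags outliers in a monitored metric with a simple z-score test, using only Python's standard library. The function and the sample data are illustrative, not drawn from any specific governance product.

```python
from statistics import mean, stdev

def flag_anomalies(values, z_threshold=3.0):
    """Flag values whose z-score exceeds the threshold.

    A basic data-quality check of the kind a governance
    pipeline might run over incoming metrics.
    """
    mu, sigma = mean(values), stdev(values)
    # Guard against sigma == 0 (all values identical)
    return [v for v in values if sigma and abs(v - mu) / sigma > z_threshold]

daily_orders = [102, 98, 105, 99, 101, 97, 100, 950]  # 950 looks suspect
print(flag_anomalies(daily_orders, z_threshold=2.0))
```

In practice a flagged value would be routed to a steward for review rather than dropped automatically, keeping humans in the governance loop.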

Fireside conversation – Gender Bias in AI – Whom Do We Blame: Data, Model, Applications or Society?

An interesting fireside chat on ‘Gender Bias in AI – Whom Do We Blame: Data, Model, Applications or Society?’ featured Kalika Bali from Microsoft Research in conversation with Shreya Mukhopadhyay.

The topic delved into the attribution of responsibility regarding the presence of gender bias in artificial intelligence. It examined whether the culpability lay with the training data employed, the AI models utilised, the applications that utilise them, or the societal prejudices ingrained in the data.

This discourse highlighted the intricate interaction among these elements and underscored the importance of shared accountability in acknowledging and rectifying gender bias within AI systems. “I had the privilege of participating in the fireside chat, unsure of how the audience would respond. To my delight, numerous individuals approached me, eager to contribute and learn more about making a social impact. Their genuine curiosity and willingness to take action left me truly thrilled,” said Bali.

Responsible AI in Gaming

Rukma Talwadker from Games24x7 delivered the closing keynote on ‘Responsible AI in Gaming’. “I had an incredible opportunity to present my research on identifying excessive gameplay patterns during an amazing session. The engagement and appreciation from the audience were truly gratifying. Kudos to Intuit for organising a well-executed event with excellent session choices,” said Talwadker.

Responsible AI in gaming involves ethically and thoughtfully incorporating AI technologies into the gaming sector. It encompasses principles like equitable and inclusive portrayal, transparent algorithms, and mitigating adverse effects. Game developers aim to craft AI-infused gaming experiences that celebrate diversity, prevent bias, safeguard user privacy, and prioritise player welfare. Responsible AI in gaming strikes a harmonious equilibrium between innovation and ethical standards, ensuring that AI enriches gameplay while upholding players’ rights and values.

The conference concluded with a highly impactful mentoring session, leaving all participants enriched and empowered. With engaging discussions and meaningful connections, WiDS served as a catalyst for personal and professional growth for participants, solidifying its place as a remarkable gathering in the field.

So come join us at the next edition of the Women in Data Science Conference, where industry pioneers share cutting-edge knowledge, empowering you to thrive in a rapidly evolving landscape.

The post It’s a Wrap on the Women in Data Science Conference at Intuit appeared first on Analytics India Magazine.

Google Sheets is Now Equipped with Generative AI


The latest addition to Workspace Labs’ Duet AI features is the introduction of the “Help me organize” feature for Google Sheets.

Recently announced at I/O 2023, Google is utilising generative AI to offer suggestions and create table templates for Google Sheets, such as product roadmaps, budgets, and events. A side panel called “Help me organize” allows users to input prompts for Google Sheets.

Duet AI for Google Workspace can now help you stay organized in Google Sheets. Just describe what you want to accomplish, and Sheets will generate custom templates to help you get started. Rolling out now in #GoogleWorkspace Labs → https://t.co/0VPbhLziA0 pic.twitter.com/t7RH9haY8l

— Google Workspace (@GoogleWorkspace) June 22, 2023

Users can then insert the generated table and customise it according to their needs. This feature is designed to assist with tasks that involve complex tracking and organisation, potentially suggesting factors that were not initially considered. For example, users can ask Google Sheets to draft a trip planner or a task tracker.

The rollout of this feature has started today and is being gradually released to trusted testers in Google’s Workspace Labs program. Google Sheets is well-suited for the integration of AI, considering its frequently challenging use cases.

Read: Google’s Search Supremacy Reinforced with Generative AI

Google has been pushing generative AI in the Google Workspace with Duet AI for unleashing creativity.

For this, image generation using Imagen and other models is still being introduced to Google Slides after the initial announcement earlier this month. These additions join the existing “Help me write” feature in Gmail and Google Docs, with the latter currently available only on the web and not on mobile devices.

The post Google Sheets is Now Equipped with Generative AI appeared first on Analytics India Magazine.

Andrew Ng Releases Generative AI with LLMs Course with AWS


Andrew Ng’s DeepLearning.AI, in partnership with Amazon Web Services (AWS), has announced an exciting new course on Coursera called “Generative AI with Large Language Models” to address the growing demand for expertise in this field.

Click here to enrol for the course.

By enrolling in this course, participants will gain a comprehensive understanding of the generative AI lifecycle based on LLMs and the underlying transformer architecture that powers them. They will learn how to effectively utilise LLMs for various tasks by selecting the most suitable model and implementing appropriate training techniques.

Apart from Andrew Ng, the instructors include Antje Barth, principal developer advocate at AWS; Chris Fregly, principal solutions architect at AWS; Shelbee Eigenbrode, principal solutions architect at AWS; and Mike Chambers, developer advocate at AWS.

The course will also cover cutting-edge methods for training, fine-tuning, inference, and deployment of models, ensuring optimal performance in real-world scenarios. Additionally, learners will acquire essential skills to navigate the evolving landscape of generative AI and effectively integrate it into their organisations and products. The course covers the key stages of the generative AI lifecycle:

  • Data gathering: Collecting relevant data for training the generative AI model.
  • Model selection: Choosing the appropriate model architecture for the task.
  • Performance evaluation: Assessing the quality and effectiveness of the generated outputs.
  • Deployment: Implementing the generative AI model in a real-world setting.

By the end of the course, learners will be able to:

  • Describe in detail the transformer architecture that powers LLMs.
  • Explain how LLMs are trained using the transformer architecture.
  • Discuss how fine-tuning allows LLMs to be adapted to specific use cases.
  • Utilise empirical scaling laws to optimise the model’s objective function based on factors such as dataset size, compute budget, and inference requirements.
  • Apply state-of-the-art training, tuning, inference, and deployment methods and tools to maximise the performance of generative AI models within the specific constraints and requirements of a project.
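The empirical scaling laws mentioned above can be made concrete with the widely cited Chinchilla-style rule of thumb: training compute C is roughly 6·N·D (N parameters, D training tokens), and the compute-optimal data budget is roughly 20 tokens per parameter. The sketch below is an illustration of that heuristic, not material from the course itself.

```python
import math

def chinchilla_optimal(compute_flops, tokens_per_param=20, flops_coeff=6):
    """Rough compute-optimal model/data split under the empirical
    rule of thumb C ~ 6*N*D with D ~ 20*N (Chinchilla-style).

    Returns (params, tokens). Solving 6 * N * (20 * N) = C gives
    N = sqrt(C / 120).
    """
    params = math.sqrt(compute_flops / (flops_coeff * tokens_per_param))
    tokens = tokens_per_param * params
    return params, tokens

# Example: a 1e21 FLOP training budget
n, d = chinchilla_optimal(1e21)
print(f"~{n / 1e9:.1f}B parameters, ~{d / 1e9:.0f}B tokens")
```

The practical takeaway for learners is that for a fixed compute budget, model size and dataset size trade off against each other, so growing the model without growing the data wastes compute.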

As businesses adapt to leverage the power of generative AI, the associated complexities and uncertainties surrounding this technology have become inevitable. Andrew Ng said that this course aims to demystify the subject and equip learners with the knowledge and skills required to confidently harness the potential of LLMs in their endeavours.

Andrew Ng has been very vocal about encouraging people to learn and adapt to generative AI. Earlier this month, he also released three new generative AI courses with LangChain, OpenAI, and Lamini, after releasing a prompt engineering course in April in partnership with OpenAI.

In December 2022, DeepLearning.AI also introduced the Mathematics for Machine Learning and Data Science Specialization, a beginner-level mathematics course for AI.

The post Andrew Ng Releases Generative AI with LLMs Course with AWS appeared first on Analytics India Magazine.

Databricks Gains MosaicML and Its Generative AI for $1.3 Billion

Image: Yingyaipumi/Adobe Stock

MosaicML will join the Databricks family in a $1.3 billion deal, providing its “factory” for building proprietary generative artificial intelligence models, Databricks announced on Monday. Companies can use models like these to ease fears of intellectual property breaches.

The combination of Databricks’ data management technology and MosaicML’s ability to build AI models will let companies create their own large language platforms instead of relying on public generative AI such as OpenAI’s ChatGPT.

MosaicML has created two generative AI foundation models: MPT-7B (with 6.7 billion parameters) and MPT-30B (with 29.9 billion parameters). The MPT foundation models will join Databricks’ own open-source LLMs, Dolly 1.0 and 2.0.

Jump to:

  • Why Databricks chose MosaicML
  • What is Databricks?
  • Why Databricks plans on a future full of “private” AI
  • The goal is to make AI training, tuning and building easier
  • Who are MosaicML’s competitors?
  • More news from the Databricks + AI Summit

Why Databricks chose MosaicML

MosaicML was the right choice for the Databricks acquisition because it has the “easiest factory on the market to use,” Databricks CEO and co-founder Ali Ghodsi said at the Databricks + AI summit on Tuesday.

He also cited a similar, competitive company culture as a reason why MosaicML was a good fit.

The acquisition is still going through regulatory approval; the deal is expected to close by the end of July. Databricks will have more information on how MosaicML’s AI training and inference products will integrate with Databricks software after that process has completed, Ghodsi said.

What is Databricks?

Databricks primarily provides data storage and data management software for enterprise organizations, as well as handles data platform migration and data analytics. Databricks has partnerships with AWS and other large enterprise software and software-as-a-service providers.

Why Databricks plans on a future full of private AI

Ghodsi pointed out that his company will use MosaicML’s resources to provide “factories” where customers can build and train LLMs to their own specifications. This means companies won’t have to shell out for application programming interface connections or share proprietary data with anyone else who uses the model; the latter has become a concern for companies using ChatGPT or Google Bard. Databricks customers will be able to choose between the Dolly and MPT families or build a custom generative AI on one of the existing models.

SEE: Tips on how to decide whether a public or private generative AI model is right for your organization (TechRepublic)

Whether to use closed source or open-source AI foundation models is the battle on everyone’s mind today, Ghodsi said. Databricks is firmly on the side of open source.

“We think that it’s better for everybody if there’s open research on understanding these models,” Ghodsi said during a Q&A session at the summit. “It’s important we understand their strengths, their weaknesses, their biases and so on.

“But we also think that, most importantly, companies want to own their own model … They don’t want to use just one model that someone has provided, because it’s intellectual property. And it’s competitive.”

Customers want to control their own IP and keep their data locked down, Ghodsi said.

Junaid Saiyed, chief technology officer of data management and analytics software company Alation, also finds customers asking about generative AI. However, it’s important for organizations to know the data they are feeding the training model is good, he said in an email to TechRepublic.

“The proliferation of data sources and increasing data volumes have made it more difficult than ever for people to search for and discover the trusted, governed data they need to train their AI models,” Saiyed said. “To be really effective, generative models must be fine-tuned on domain-specific data catalogs, and humans should review their output.”

How to decide between public or proprietary AI

Umesh Sachdev, co-founder and chief executive officer of conversational AI and automation company Uniphore, recommends enterprise leaders ask themselves the following questions when deciding whether to build their own AI on a foundation model like MosaicML’s or to use public AI like the GPT series:

  • What will the model provider cost me, and how much will infrastructure cost increase due to GPUs?
  • With regulation talks still in the early stages, how much should we lean forward? If our enterprise uses ChatGPT, are we likely to be in legal crosshairs of content providers who are legally challenging the ownership or training of the data?
  • If we don’t want to use something that was trained on public or open data but more proprietary datasets from our own industry, we might ask whether all of our data is ready in one place.
  • If the pilot we do succeeds, will it scale? What about connecting all our legacy systems to this AI layer?

The goal is to make AI training, tuning and building easier

“For most organizations, they have specialized tasks that they want to do … and for that, we want them to be able to train and tune specific models,” Ghodsi said at the Databricks + AI summit.

Enterprise customers need a certain threshold of technical skill to build generative AI, Ghodsi said. He anticipates that MosaicML can fill a need for an easier way to build and train AI technology.

“Hopefully, eventually, we’ll make it something you can do with a few clicks,” Ghodsi said at the summit.

“This technology (generative AI) is in its nascency, and a lot needs to be uncovered about data sovereignty, scalability and cost,” said Sachdev in an email to TechRepublic. “Companies are moving fast to make announcements and decisions, but like most big tech waves, the opportunities will unfold in the second or third wave of development.”

“This AI transformation is revealing to business and technology leaders what the true state of their data environment is,” Saiyed said. “Organizations with a data intelligence platform and federated data governance will be able to leverage the power of GenAI before those that are only now investing in modernizing [their] data management strategy.”

Who are MosaicML’s competitors?

Competition in the area of AI training is fierce; MosaicML competes with NVIDIA, OpenAI, Anthropic and Google. On Monday, NVIDIA announced a partnership with Snowflake to add the NVIDIA NeMo LLM development platform and NVIDIA GPU-accelerated computing to the Snowflake Data Cloud.

More news from the Databricks + AI Summit

Four other major updates came out of the Databricks + AI summit:

  • The Delta Lake open-source storage framework will now be available in version 3.0, which adds Universal Format (UniForm), Kernel for Delta connectors and Liquid Clustering data layouts for easier access.
  • LakehouseIQ is a natural language chat AI running in the Databricks Unity Catalog.
  • Lakehouse AI is a toolkit for LLMs on the Lakehouse data platform.
  • Lakehouse Federation is a tool to unify previously siloed data mesh architecture.


A New OpenAI Competitor Arrives


Competing with what OpenAI has achieved in just a few months is hard. Even so, plenty of VCs are funding its competitors. Moreover, many startups that have been working in hiding to avoid being compared with the Microsoft-funded giant are slowly crawling out into the open.

Most recently, Reka, an AI startup founded by former Google, DeepMind, Baidu, Meta, and Microsoft researchers, announced that it is emerging from stealth mode and unveiled its Series A funding of $58 million. The funding round was led by DST Global Partners and Radical Ventures, along with Snowflake Ventures. Former GitHub CEO Nat Friedman also took part in the investment, after his recent investment in ElevenLabs.

Yi Tay, one of the founders of Reka AI Labs, said that the company is still in the early stages of this new AI revolution and wants to be part of the innovations in the field. He highlighted the company's two goals, to build generative models and to push the frontiers of AI research, and said it plans to use the funding to work towards them.

We’re coming out of stealth with $58M in funding to build generative models and advance AI research at @RekaAILabs 🔥🚀
Language models and their multimodal counterparts are already ubiquitous and massively impactful everywhere.
That said, we are still at the beginning of this… pic.twitter.com/uDI8nWI7Iq

— Yi Tay (@YiTayML) June 27, 2023

The research-focused founders are motivated to work on what they call "universal intelligence": general-purpose multimodal and multilingual agents that are also self-improving AI models, designed specifically for enterprise software. Reka is also hiring for both technical and non-technical roles.

Where does it stand?

Similar to OpenAI's stated mission of "benefiting humanity", Reka's mission statement says it will "build generative AI models for the benefit of humanity, organisations, and enterprises". Interestingly, the company's head office is also based in San Francisco. This arguably gives it an advantage over the recently funded, Paris-based startup Mistral AI, which might struggle with the EU's stringent policies like GDPR. Mistral AI's mission is also similar to Reka's: to build generative AI that benefits enterprises. Though it is not yet clear if the company advocates for open source.

Read: This AI Startup from Paris Raises Highest Seed Funding Ever

When it comes to the company's founders, the expertise is clearly visible. Tay, Dani Yogatama, Qi Liu, and Cyprien de Masson d'Autume have worked on big projects at Google, Microsoft, and Meta, including DeepMind's AlphaCode, Bard, and Gopher. While working on these projects, the founders realised that expecting to build an all-encompassing LLM for every possible use case is not practical.

Yogatama told TechCrunch, “We understand the transformative power of AI and would like to bring the benefits of this technology to the world in a responsible way.” The company shows interest in building smaller models, instead of larger ones like GPT-4, that can be incorporated for different use cases.

Rob Toews from Radical Ventures also told Reuters, "I think small models are going to represent a massive paradigm shift as enterprises are getting more serious about deploying AI models at scale." The visions of the investors and the founders are aligned.

What’s the future?

The company's only product, still in beta testing, is Yasa, a multimodal AI assistant for understanding images, videos, and tabular data, something very similar to what OpenAI is building with GPT-4. Users can feed their proprietary data to the bot and it will derive insights from it; the company also provides an API for this. This is something OpenAI has been trying to push with plugins and new updates to its APIs.

Reka isn’t generating revenue yet, according to Yogatama. With the funding, the company aims to acquire computing power from NVIDIA, a clear bid to compete with the big players. To start, Snowflake Computing, which itself recently partnered with NVIDIA, is allowing users to run Reka on its servers. Christian Kleinerman, senior vice president of product at Snowflake, told Reuters that the partnership is meant to guarantee users’ privacy while using such AI models. The announcement comes just after Databricks, a Snowflake competitor, acquired MosaicML in a $1.3 billion deal.

Only time will tell if any startup can outshine OpenAI. India is clearly far off, Mistral AI faces regulatory challenges in Europe, and Reka still has to prove what it can do. Competition is only going to intensify, with Y Combinator announcing that one-third of its latest batch of companies is focused specifically on building AI products, many of them in OpenAI’s own backyard, San Francisco. For now, OpenAI needs to be careful, as several of its members are also reportedly leaving the company to join Google, its biggest rival.

The post A New OpenAI Competitor Arrives appeared first on Analytics India Magazine.

US Legitimises AI Race with China, Calls to Prioritise Democratic Values

The AI arms race between America and China, touted as Cold War 2, is heating up again against the backdrop of OpenAI’s ChatGPT, which caught the public’s imagination and kicked off a mad rush not just in the US, but globally.

The US media has been selling the narrative of a “new cold war arms race” over AI since 2017, when China outlined its strategic objectives and roadmap for becoming a global leader in AI by 2030. However, a report by the European Commission has revealed that while the US currently holds the lead, with a strong AI startup ecosystem, expertise in traditional semiconductors, and high-quality research, China has made significant strides in data availability and AI adoption, racing ahead of the US and EU in these categories. The report compared the standing of the US, EU, and China in AI across six categories of metrics: talent, research, development, adoption, data, and hardware.

According to China’s plan, which was divided into three stages, the country aimed to catch up with leading AI nations in technology and applications by 2020. It seems to be pursuing that goal, directing its resources towards competing with American giants like Google and Microsoft in a rivalry re-ignited by a series of advancements in generative AI.

During the recent US House Committee on Science, Space, & Technology hearing, ‘Advancing Innovation Towards the National Interest’, chairman Frank Lucas opened with a question and a concern about China narrowing the gap in AI research and innovation. He cited Stanford research indicating that nine of the top 10 universities, ranked by the number of AI papers published, were based in China. Standing alone in the 10th spot was US-based MIT.

Trouble mounts

China’s tech industry has faced challenges in the past due to US tech sanctions, regulatory demands, and Western distrust, which have limited the international expansion of Chinese tech companies. Recently, there has again been a buzz that the Commerce Department is considering banning sales of AI chips to China unless US companies get a special license. On the news, stocks of NVIDIA and AMD slumped. Despite these obstacles, Chinese companies are determined to catch up in the AI race.

NVIDIA found a workaround by supplying versions of its products, such as the A800, designed to avoid the ban. Chinese executives also feel that stacking more chipsets can offset the issues arising from these bans.

China’s investment in AI is rapidly increasing, although it still lags behind the US. In 2023, an estimated $15 billion will be spent on AI technology in China compared to $26.6 billion in the US. However, the Chinese government recognises the critical importance of AI for maintaining the country’s technological standing and is likely to mobilise resources to drive advancements. The number of Chinese venture deals in AI is on the rise and has surpassed those in consumer tech.

Chinese entrepreneurs, engineers, and former employees of major tech companies are driven by the ambition to surpass their US counterparts in AI, which is seen as a technology that could determine global power dynamics.

Leading Chinese tech players such as Baidu, Alibaba, and SenseTime are already deploying AI bots like Ernie Bot, Tongyi Qianwen, and SenseChat. The Chinese AI ecosystem also includes former employees of major tech companies and prominent venture capitalists backing AI startups. Prominent figures joining this effort include Wang Changhu, former director of ByteDance’s AI Lab; Zhou Bowen, ex-president of JD.com Inc.’s AI and cloud computing division; Wang Huiwen and Wang Xing of Meituan; and venture capitalist Kai-fu Lee, known for his support of Baidu.

However, Chinese demos of AI technology show there is still a long way to go, and critics argue that true innovation requires the free-wheeling exploration and experimentation that is more prevalent in the US. Hugging Face CEO Clement Delangue, testifying at the recent US hearing, also attested that the US ecosystem allowed for the growth of Hugging Face, a big proponent of open-source innovation and research in AI. “I believe we could not have created this company anywhere else. I am living proof that the openness and culture of innovation in the US allows for such a story to happen,” he said.

He also emphasised that open science and open-source initiatives are crucial for the US to maintain its leadership position in AI. These approaches, he argued, align with American values and interests, and openness in AI research and development needs to be incentivised and promoted across all companies.

On the other hand, Chinese AI initiatives are challenged by censorship and flawed datasets due to restrictions on information flow and expression. The strict censorship regime in China raises concerns about the impact on the development of AI and the scale of searchable information. China has the lowest Internet Freedom Score, indicating significant obstacles to access and limits on content. The Chinese government’s new draft rules on chatbots emphasise the need for content to align with socialist values, further restricting AI-generated information. Controlling AI-generated content through censorship poses challenges, resulting in disappointing outcomes for Chinese generative AI efforts.

Real Life Consequences

This race is not just about technological supremacy; many analysts and executives predict that AI will have a profound impact on shaping future technology leaders, similar to how the internet and smartphones gave rise to global titans. They also assert that AI has the potential to drive advancements in various fields, ranging from supercomputing to military capabilities, potentially influencing the geopolitical balance between nations.

Citing the use of AI in the Ukraine war, chairman Frank Lucas said, “It is in our national interest to ensure the United States has a robust innovation pipeline that supports fundamental research all the way through to real-world applications. The country that leads in commercial and military applications will have a decisive advantage in global economic and geopolitical competition.”

Harvard economics professor David Yang highlighted and cautioned about China’s significant success in AI, evidenced by Chinese companies producing the most accurate facial recognition technology.

Additionally, the US-China AI race is also being viewed through a democracy-versus-autocracy lens. Autocratic regimes and weak democracies show a particular interest in AI, with purchases increasing after political unrest and protests, as these regimes look to the technology to predict and control the behaviour of their citizens.

The post US Legitimises AI Race with China, Calls to Prioritise Democratic Values appeared first on Analytics India Magazine.

Canva Can’t Keep Up With Adobe Anymore

What low-code and no-code did for programming, Canva did for designing. Budding designers no longer had to spend hours learning complex Adobe software. This helped Canva make it big in the design space, with over 100 million users flocking to the platform for their design needs.

This took a big chunk of market share from Adobe, prompting it to respond with Adobe Express, a direct competitor to Canva. The application was first launched in 2015 as an iOS app and later came to the desktop as Adobe Spark. With its latest update, however, Express has gone beyond being just a Canva competitor.

Adobe is bringing Firefly, its generative AI service, to Adobe Express. Powered by NVIDIA’s Picasso suite of generative AI models, this update blows Canva’s OpenAI-powered Magic Design out of the water. After this move, Canva is yet again left on the back foot to catch up to Adobe.

Design giants square up

To understand the impact of each of these giants’ generative AI moves, we must first delve deeper into the features offered by the respective applications. Canva boasts a handful of AI features, which are stated to be powered by OpenAI’s algorithms. This includes DALL-E for image generation and GPT-like models for text generation.

Using these algorithms, Canva has integrated text-to-image capabilities, a text copy generator, and a GenAI-powered design tool called Magic Design. In addition, the platform has added a host of AI-powered quality-of-life features.

Adobe, on the other hand, is going for a more comprehensive content strategy with a focus on animation and video. Aimed especially at creating content for TikTok and Instagram Reels, Express features an all-in-one editor with AI behind every corner. It can create not only images, but also videos, animations, and even documents.

Adobe Express’ generative AI features include text-to-image through Firefly, along with the ability to generate what Adobe calls text-effects. Also included are AI-powered features to find the perfect addition to the given content, to automatically resize creatives, and to get personalised AI recommendations.

Adobe’s generated images have the unique advantage of being completely free from any copyright issues, as they have been trained on Adobe Stock’s corpus of licensed images. Express also has the advantage of being plugged into Adobe’s ecosystem, boasting integration with Photoshop, Illustrator, Acrobat and Premiere Pro, along with free access to over 22,000 fonts through Adobe TypeKit.

While both seem evenly matched on the surface level, there are a few shortcomings that Canva might not be able to surpass to beat out Adobe. Firefly was one of the biggest launches for Adobe, with the beta alone generating 100 million assets in just a few days. With the eventual integration of Firefly into Adobe’s Creative Cloud, it seems the clock has begun ticking for Canva.

Canva begins to fall behind

When comparing the outputs from Adobe Firefly and Canva’s text-to-image feature, the quality difference is obvious. When given a prompt to create a Japanese flower garden with elegant bridges and a waterfall, these were the options provided by Firefly and Canva.

Canva (L) vs. Adobe Firefly (R)

As we can see, there is a very apparent gap in the capabilities of both these algorithms. While both of them still fall behind when compared to standalone algorithms like Stable Diffusion and Midjourney, Adobe definitely wins out over Canva.

One of the biggest unseen issues with Canva’s image generator is the lack of clarity on the copyright status of the images. In the FAQ section of its image generator, Canva states, “The treatment of AI-generated images and other works under copyright law is an open question and the answer may vary depending on what country you live in. However, please note that this does not mean you are the copyright owner of the images.”

This is most likely due to the fact that the model behind Canva’s text-to-image is Stable Diffusion, an AI model mired in copyright issues. Adobe, on the other hand, has not only absolved itself of any copyright issues, but is also actively working towards the responsible implementation of AI art through its Content Authenticity Initiative. In a blog delving into Firefly, Adobe stated, “Firefly’s first model is trained on a unique dataset that includes Adobe Stock images, openly licensed content and other public domain content without copyright restrictions.”

While Canva enjoyed a period of relative success over the past decade, Adobe seems more than eager to take back the throne from the upstart. By continuing to flesh out Adobe Express and integrate it with the Creative Cloud, Adobe has slowly begun a strategy to win back ground from Canva. Building on its existing base of 30 million Creative Cloud users, Express’ generative AI-powered editing might become the spearhead of Adobe’s freemium model. Canva seems to be doing all it can to hold on to its sizeable user base, but thanks to Firefly, that might not be enough to keep it afloat.

The post Canva Can’t Keep Up With Adobe Anymore appeared first on Analytics India Magazine.

Nvidia CEO Huang: Get Ready for Software 3.0

June 28, 2023, by Agam Shah

Nvidia's CEO Jensen Huang said that artificial intelligence is ushering in the era of Software 3.0, where creating and running applications will be as simple as writing queries into a universal AI translator, running a few lines of Python code, and selecting an AI model of your choice.

"That's the reinvention of the whole stack — the processor is different, the operating system is different, the large language model is different. The way you write AI applications is different… Software 3.0, you do not have to write it at all," Huang said at a fireside chat during the Snowflake Summit this week.

Huang talked about the emerging software landscape as the company switches gears to a software-sells-hardware strategy, a complete flip of its past hardware-sells-software strategy. Nvidia hopes to sell more software that runs only on its GPUs.

Software 3.0 applications will change the way users interact with computers, Huang said, adding that the interface will be a universal query engine "that's super intelligent and you can get it to … respond to you."

Users can type prompts and context into the query engine, which routes them through large language models that may be connected to corporate databases or other data sources. ChatGPT is an early iteration of how this system will work, but Huang said this will impact every facet of computing.

The Software 3.0 concept relies on a new structure of data, algorithms, and compute engines, Huang said, adding that instead of command lines, users will be able to talk to databases and "ask it all kinds of questions about what, how, when and why."

He gave the example of ChatPDF, which analyzes and summarizes giant PDF documents. Large language models could also generate programming code if needed.

Nvidia CEO Jensen Huang

"We'll develop our own applications, everybody's going to be an application developer," Huang said, adding that conventional programs in companies will be replaced by hundreds of thousands of AI applications.

It is the early days of this new type of computing, a departure from the old style that relied on bringing data to computers and processing it via CPUs. That structure is becoming untenable as its performance can no longer scale.

The Software 3.0 approach will merge data from multimodal sources that include images, text, and voice. Huang added that "for the very first time you could develop a large language model, stick it in front of your data and you talk to your data… like you talk to a person."

Startups like Glean and Neeva (which was acquired by Snowflake) are investing in technologies that connect AI search within enterprises to large language models. On a consumer front, Microsoft and Google are sending queries from search to supercomputers with AI chips that process the queries and return a response.

Nvidia's strategy is to provide the hardware and software on both ends, consumer and enterprise, to run artificial intelligence applications. Nvidia's involvement right now is mostly behind the scenes, but ChatGPT relies heavily on Nvidia GPUs to process queries.

Applications can be developed using frameworks like LangChain, and intermediate agents and data sources can be added in between the AI processing steps to provide more fine-tuned responses.
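The pattern described above — a prompt routed through a model, with an intermediate data source consulted to ground the answer — can be sketched in a few lines of plain Python. This is an illustrative stand-in, not LangChain's actual API: the class names, the stubbed model, and the expense-report scenario (borrowed from the example later in this article) are all hypothetical.

```python
class ExpenseDatabase:
    """Stand-in for a corporate data source the pipeline can consult."""
    def __init__(self, records):
        self.records = records

    def lookup(self, quarter):
        # Retrieve only the rows relevant to the user's question.
        return [r for r in self.records if r["quarter"] == quarter]


class StubModel:
    """Stand-in for an LLM; a real pipeline would call a hosted model here."""
    def summarize(self, question, rows):
        total = sum(r["amount"] for r in rows)
        return f"{question}: {len(rows)} expenses totaling ${total}"


def answer(question, quarter, db, model):
    # 1. An intermediate step fetches grounding data from the source.
    rows = db.lookup(quarter)
    # 2. The question plus retrieved context goes to the model.
    return model.summarize(question, rows)


db = ExpenseDatabase([
    {"quarter": "Q1", "amount": 120},
    {"quarter": "Q1", "amount": 80},
    {"quarter": "Q2", "amount": 50},
])
print(answer("Q1 expense report", "Q1", db, StubModel()))
# prints: Q1 expense report: 2 expenses totaling $200
```

The point of the intermediate step is that the model only ever sees data scoped to the question, which is what makes the responses "more fine-tuned" than a bare prompt against a general-purpose model.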

One such intermediary is Nvidia's NeMo Guardrails, which curbs chatbot hallucinations so large language models stay on track and provide relevant answers to queries. Huang also bet on large language models with billions of parameters to make AI relevant, likening one to a college grad pre-trained to be super smart. The large models will be surrounded by smaller models augmented with specialized knowledge, which could support enterprises.

Huang estimates the new AI landscape will slowly disassemble the older software and hardware stack. Microsoft and Intel thrived with Windows and x86 chips on conventional desktop PCs, and Google thrived in the Internet era with search.

Microsoft and Google are already blending their old computing models by plugging their own large-language AI models into applications. Microsoft has a fork of GPT-4 powering Bing, while Google has the PaLM-2 transformer, and is also developing Gemini, which is still being trained.

Nvidia's future lies in the Software 3.0 concept, with its GPUs as the main computing hardware. Nvidia saw the AI opportunity many years ago and has invested heavily in developing a complete AI stack of software, services, and hardware to chase the opportunity, said Jim McGregor, principal analyst at Tirias Research.

The company's AI operating system is the AI Enterprise Suite, which includes large language models like NeMo, compilers, libraries, and development stacks. The software developed via AI Enterprise will need Nvidia's GPUs, which can be found on-premise or in the cloud.

At this week's Snowflake Summit, Nvidia announced software partnerships that provided clarity on how it would lock customers into using its software and GPUs in the cloud.

Nvidia said it was bringing its NeMo large language model to the Snowflake Data Cloud, which top organizations use to store data. NeMo provides a pre-trained model into which companies can feed their own data to create custom models. Enterprises can generate their own tokens and customize the models, and queries to the database will deliver more fine-tuned answers. For example, employees could generate an expense report for a specific quarter from a single prompt.

Nvidia's NeMo transformer model is trained on a generic corpus of data, and companies will augment the model with their own data. The proprietary corporate data will remain locked in their model and will not be sent back to the larger models, said Manuvir Das, vice president for enterprise computing at the company, during a press briefing.

Users of the Snowflake Data Cloud will be able to connect the software to hardware on cloud service providers, which have set up their own supercomputers with Nvidia's GPUs. A few months ago, Google announced the A3 supercomputer, which has 26,000 Nvidia GPUs. Microsoft has its own Azure supercomputer with thousands of Nvidia GPUs.

Nvidia is also providing the ability for third-party customers to use large language models and smaller custom models via a partnership with ServiceNow, which was announced earlier this year. In this partnership, ServiceNow is using NeMo to create models for their customers. But the software-as-a-service company also provides access to other AI models such as OpenAI's GPT-4, giving flexibility for customers to use GPT-4 instead of Nvidia's NeMo.

ServiceNow also provides connectors that provide customers access to many AI options. For example, Glean, which uses multiple LLMs, integrates with ServiceNow.

Nvidia is a top player in the AI market, and its main hardware competitors, Advanced Micro Devices and Intel, are far behind with no proven commercial success.

AMD this month introduced a new GPU, the Instinct MI300X, which is targeted at AI but comes with no clear software strategy, its focus being squarely on hardware. Tirias Research's McGregor said that AMD was late to the game and, as a smaller company, does not have the resources to pour into software.

Intel has many AI chips in its portfolio, including the Gaudi2 AI chip and the Data Center GPU Max for training, but those chips are still not being sold in volume. Intel's contrasting software strategy revolves around an open approach, so developers can write AI applications that can run on any hardware.
