GenAI Service Market to Grow at 45% CAGR by 2033 

With the rise of generative AI over the last year and a half, its impact on company revenues has become increasingly evident. According to a forthcoming AIM Research report, the total generative AI market is expected to grow at a compound annual growth rate (CAGR) of 40.22% from 2023 to 2033.

Furthermore, companies in the service sector are anticipated to experience a slightly higher CAGR of 45%, indicating a faster GenAI expansion compared to the overall market.
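As a back-of-the-envelope check on what those growth rates imply, compounding can be worked out directly. The sketch below assumes the CAGRs apply over the full 2023–2033 decade, which is an inference from the dates above:

```python
# Rough sketch: the total growth a constant CAGR implies over ten years.
# The 2023-2033 horizon is inferred from the report summary above.

def growth_multiple(cagr: float, years: int) -> float:
    """Total growth multiple implied by a constant annual growth rate."""
    return (1 + cagr) ** years

overall = growth_multiple(0.4022, 10)   # overall GenAI market, 40.22% CAGR
services = growth_multiple(0.45, 10)    # GenAI services, 45% CAGR

print(f"Overall market multiple: {overall:.1f}x")   # ~29.4x
print(f"Service sector multiple: {services:.1f}x")  # ~41.1x
```

In other words, a 45% CAGR sustained for ten years implies roughly a 41x expansion for the service segment, versus about 29x for the overall market.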

GenAI Revenues Galore

Most recently, NVIDIA reported a revenue of $26.04 billion in Q1 2025, fuelled by the generative AI boom. In contrast, during the first quarter of 2024, NVIDIA’s revenue was $7.19 billion.

NVIDIA is followed by AMD in the data centre market, with AMD holding a 3% share. Total spending in this market reached $49 billion in 2023, a significant increase from $17 billion in 2022, according to AIM Research.

While Google, Microsoft, AWS, Apple, and Meta, alongside Intel and AMD, are investing heavily in developing custom chips to run AI workloads, experts believe they will never be able to catch up with NVIDIA, as it controls a whopping 95% of the AI chip market.

Cloud companies are at the forefront of benefitting from generative AI. Microsoft Azure has shown steady growth in the cloud sector, increasing its market share.

In the recent quarter, Azure’s cloud revenue was $26.7 billion, up 23% year-on-year (YoY). Google Cloud reported revenue of $9.6 billion, up 28.43% YoY. On the other hand, AWS reported revenue of $25.04 billion in Q1 2025, up 13% from the fourth quarter of the previous year.

IT Companies

Generative AI has contributed to the revenue growth of Indian IT companies by creating new business opportunities and enhancing productivity. For instance, TCS reported GenAI deals worth $900 million in Q4 FY24.

Meanwhile, Tech Mahindra achieved a 27% increase in revenue per advertiser for a large retail e-commerce client in the SMB segment. The company implemented generative AI in revenue operations (RevOps) to optimise ad campaigns and enhance customer satisfaction.

Wipro is also feeling confident about its generative AI solutions. In the latest quarter, the company launched the Wipro Enterprise Artificial Intelligence Ready platform in partnership with IBM, expanding on a substantial investment in AI.

Infosys revealed in its Q4 FY24 quarterly reports that the IT giant is seeing excellent traction with its clients for generative AI work. The company generated over 3 million lines of code using generative AI and large language models available in the public domain.

Global IT firm Accenture secured multiple GenAI projects worth $600 million in the last quarter, building upon the $450 million in projects secured in the preceding quarter. The company’s planned investment of $3 billion aims to enhance its capabilities and cement its position as a top service provider.

Genpact, on the other hand, reported a 4% increase in Q1 2024 revenues to $1.13 billion, with significant execution improvements in data, generative AI, and digital operations.

Data and Analytics

Snowflake recently announced that their quarterly revenue was $790 million, up 34% year-over-year. The company has announced that its product revenue is projected to be between $805 million and $810 million for the current quarter ending in July, also up 34% year-over-year. In contrast, the total revenue in Q1 2021 was $228.9 million.

Since Sridhar Ramaswamy assumed the CEO position, Snowflake has transformed from a data cloud company to a data and AI-driven entity with a strong emphasis on generative AI.

“I think it’s a huge opportunity in the world of data applications and AI. It will keep me busy for many years to come,” said Ramaswamy in a recent interview after taking the helm at Snowflake.

On the other hand, in 2024, Databricks reached a revenue run rate of $1.9 billion, with a year-over-year growth rate of 26.67%. The company reported $1.6 billion in revenue for the fiscal year ending January 31, 2024, representing over 50% year-over-year growth.

Databricks’ India arm has recorded an 80% annualised growth over the past two fiscal years, from February 1, 2022, to January 31, 2024. The company attributes this surge to the rising demand for data and AI capabilities among Indian enterprises.

Indian enterprises, including Air India, Aditya Birla Fashion and Retail, CommerceIQ, Freshworks, InMobi, Meesho, Myntra, Parle, and UPL, use Databricks’ platform.

BFSI Sector

Banks that move quickly to scale generative AI across their organisations could increase their revenues by up to 600 basis points (6 percentage points) in three years, according to Accenture research. The analysis found that banks that effectively adopt and scale generative AI could increase employee productivity by up to 30%, streamlining numerous language-related tasks.

One notable example of generative AI increasing a bank’s revenue is Wells Fargo’s implementation of its generative AI virtual assistant, named Fargo. Launched in March 2023, Fargo has handled 20 million interactions and is projected to reach 100 million interactions annually.

Fargo is built on Google’s PaLM 2 and can answer everyday banking queries, provide insights into spending patterns, check credit scores, pay bills, and offer transaction details.

Pharma

Moderna reported a revenue of $167 million in Q1 2024, down significantly from the $2.8 billion earned in the previous quarter due to a drop in sales of its COVID-19 vaccine, Spikevax. Following that, the pharma giant partnered with OpenAI to use ChatGPT Enterprise for mRNA medicine development, aiming to launch up to 15 new products in the next five years, including a vaccine for respiratory syncytial virus and personalised cancer treatments.

Of all the AI startups, OpenAI is leading the way. It reached the $2 billion revenue milestone in December, according to a report by the Financial Times. The report indicated that OpenAI expects to more than double this figure by 2025, driven by a strong interest from business customers looking to implement generative AI tools in the workplace.

Chinese investor and entrepreneur Kai-Fu Lee is bullish about OpenAI becoming a trillion-dollar company in the next two to three years. “OpenAI will likely be a trillion-dollar company in the not-too-distant future,” said Lee at a recent event with Fortune.

The post GenAI Service Market to Grow at 45% CAGR by 2033 appeared first on AIM.

How to Use GPT for Generating Creative Content with Hugging Face Transformers


Introduction

GPT, short for Generative Pre-trained Transformer, is a family of transformer-based language models. OpenAI's GPT-2 was one of the first transformer-based models capable of generating coherent text, and it can be used as a tool for a variety of applications, including helping to write content more creatively. The Hugging Face Transformers library provides pretrained models and simplifies working with these sophisticated language models.

The generation of creative content could be valuable, for example, in the world of data science and machine learning, where it might be used in a variety of ways to spruce up dull reports, create synthetic data, or simply help to guide the telling of a more interesting story. This tutorial will guide you through using GPT-2 with the Hugging Face Transformers library to generate creative content. Note that we use the GPT-2 model here for its simplicity and manageable size, but swapping it out for another generative model will follow the same steps.

Setting Up the Environment

Before getting started, we need to set up our environment by installing the necessary libraries and importing the required packages.

Install the necessary libraries:

pip install transformers torch

Import the required packages:

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

You can learn about Hugging Face Auto Classes and AutoModels here.

Loading the Model and Tokenizer

Next, we will load the model and tokenizer in our script. The model in this case is GPT-2, while the tokenizer is responsible for converting text into a format that the model can understand.

model_name = "gpt2"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

Note that changing the model_name above can swap in different Hugging Face language models.

Preparing Input Text for Generation

In order to have our model generate text, we need to provide the model with an initial input, or prompt. This prompt will be tokenized by the tokenizer.

prompt = "Once upon a time in Detroit, "
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

Note that the return_tensors='pt' argument ensures that PyTorch tensors are returned.
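To build intuition for what tokenization produces, here is a deliberately simplified toy tokenizer. Note this is purely illustrative: the real GPT-2 tokenizer uses byte-pair encoding (BPE) over subword units and a vocabulary of 50,257 entries, not whole lowercase words:

```python
# Toy illustration only: maps whole words to integer IDs.
# The real GPT-2 tokenizer operates on BPE subword units, not words.

toy_vocab = {"once": 0, "upon": 1, "a": 2, "time": 3, "in": 4, "detroit": 5}

def toy_encode(text: str) -> list[int]:
    """Look each lowercase word up in the toy vocabulary."""
    return [toy_vocab[word] for word in text.lower().split()]

def toy_decode(ids: list[int]) -> str:
    """Invert the mapping to recover the words."""
    id_to_word = {i: w for w, i in toy_vocab.items()}
    return " ".join(id_to_word[i] for i in ids)

ids = toy_encode("Once upon a time in Detroit")
print(ids)              # [0, 1, 2, 3, 4, 5]
print(toy_decode(ids))  # once upon a time in detroit
```

The model only ever sees these integer IDs; the tokenizer handles the translation in both directions.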

Generating Creative Content

Once the input text has been tokenized and prepared for input into the model, we can then use the model to generate creative content.

gen_tokens = model.generate(input_ids, do_sample=True, max_length=100, pad_token_id=tokenizer.eos_token_id)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)

Customizing Generation with Advanced Settings

For added creativity, we can adjust the temperature and use top-k sampling and top-p (nucleus) sampling.

Adjusting the temperature:

gen_tokens = model.generate(input_ids,
                            do_sample=True,
                            max_length=100,
                            temperature=0.7,
                            pad_token_id=tokenizer.eos_token_id)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)
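Under the hood, temperature divides the model's logits before the softmax, so values below 1.0 sharpen the distribution and values above 1.0 flatten it. This pure-Python sketch uses made-up toy logits rather than actual model output:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply a numerically stable softmax."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical next-token scores

sharp = softmax_with_temperature(toy_logits, 0.7)  # more peaked
flat = softmax_with_temperature(toy_logits, 1.5)   # closer to uniform

print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
```

Running this shows the top-scoring token receiving a noticeably larger share of probability at temperature 0.7 than at 1.5, which is why lower temperatures produce more predictable text.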

Using top-k sampling and top-p sampling:

gen_tokens = model.generate(input_ids,
                            do_sample=True,
                            max_length=100,
                            top_k=50,
                            top_p=0.95,
                            pad_token_id=tokenizer.eos_token_id)
gen_text = tokenizer.batch_decode(gen_tokens)[0]
print(gen_text)
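Conceptually, both strategies restrict which tokens are eligible for sampling: top-k keeps the k highest-probability tokens, while top-p keeps the smallest set whose cumulative probability reaches p. The following is a toy sketch of that filtering step over a made-up distribution, not the Transformers implementation:

```python
def top_k_filter(probs, k):
    """Keep the k most probable entries; return (index, prob) pairs."""
    ranked = sorted(enumerate(probs), key=lambda ip: ip[1], reverse=True)
    return ranked[:k]

def top_p_filter(probs, p):
    """Keep the smallest high-probability prefix whose mass reaches p."""
    ranked = sorted(enumerate(probs), key=lambda ip: ip[1], reverse=True)
    kept, cumulative = [], 0.0
    for index, prob in ranked:
        kept.append((index, prob))
        cumulative += prob
        if cumulative >= p:
            break
    return kept

toy_probs = [0.5, 0.25, 0.15, 0.07, 0.03]  # hypothetical next-token distribution

print(top_k_filter(toy_probs, 2))    # [(0, 0.5), (1, 0.25)]
print(top_p_filter(toy_probs, 0.85)) # [(0, 0.5), (1, 0.25), (2, 0.15)]
```

Notice that top-p adapts: a peaked distribution keeps few tokens while a flat one keeps many, which is why nucleus sampling is often combined with, or preferred over, a fixed top-k.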

Practical Examples of Creative Content Generation

Here are some practical examples of using GPT-2 to generate creative content.

# Example: Generating story beginnings
story_prompt = "In a world where AI controls everything, "
input_ids = tokenizer(story_prompt, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids,
                            do_sample=True,
                            max_length=150,
                            temperature=0.4,
                            top_k=50,
                            top_p=0.95,
                            pad_token_id=tokenizer.eos_token_id)
story_text = tokenizer.batch_decode(gen_tokens)[0]
print(story_text)

# Example: Creating poetry lines
poetry_prompt = "Glimmers of hope rise from the ashes of forgotten tales, "
input_ids = tokenizer(poetry_prompt, return_tensors="pt").input_ids
gen_tokens = model.generate(input_ids,
                            do_sample=True,
                            max_length=50,
                            temperature=0.7,
                            pad_token_id=tokenizer.eos_token_id)
poetry_text = tokenizer.batch_decode(gen_tokens)[0]
print(poetry_text)

Summary

Experimenting with different parameters and settings can significantly impact the quality and creativity of the generated content. GPT models, especially more recent versions, have tremendous potential in creative fields, enabling data scientists to generate engaging narratives, synthetic data, and more. For further reading, consider exploring the Hugging Face documentation and other resources to deepen your understanding and expand your skills.

By following this guide, you should now be able to harness the power of GPT-2 and Hugging Face Transformers to generate creative content for various applications in data science and beyond.

For additional information on these topics, check out the following resources:

  • Hugging Face Transformers Documentation
  • PyTorch Documentation
  • Generative Pre-trained Transformer (Wikipedia)

Matthew Mayo (@mattmayo13) holds a Master's degree in computer science and a graduate diploma in data mining. As Managing Editor, Matthew aims to make complex data science concepts accessible. His professional interests include natural language processing, machine learning algorithms, and exploring emerging AI. He is driven by a mission to democratize knowledge in the data science community. Matthew has been coding since he was 6 years old.


Microsoft Edge will use AI to translate YouTube videos while you watch

Microsoft Edge's translation feature

Microsoft Edge will soon let you watch and listen to certain online videos in other languages. At its Build conference on Tuesday, Microsoft announced that the new AI-powered feature will translate videos in real time on YouTube as well as a range of other sites.

On a Microsoft Edge features page, the company revealed more details about the real-time translation. To start, the feature will work with videos on YouTube, Reuters, CNBC News, Bloomberg, Money Control, LinkedIn, and Coursera. Microsoft said that it's looking to support other sites in the future.

Also: Microsoft's latest Windows 11 security features aim to make it 'more secure out of the box'

For now, the number of supported languages is limited. Audio sources in Spanish and Korean can be translated into English, while audio in English can be translated into Hindi, German, Italian, Spanish, and Russian. Microsoft plans to add more languages after rolling out the feature.

To address privacy concerns, Microsoft said that the translation will occur completely on your PC or mobile phone. The company promises that no data or content from the video leaves your device or gets processed in the cloud.

Although Microsoft strives to ensure the translations are accurate, the usual flaws and fallibilities found in AI may pop up. In particular, the translation is likely to be affected by such factors as the source language, the number of speakers, and any background music.

A video clip of Microsoft CEO Satya Nadella on the Edge features page shows how the real-time translation would work. Hovering over a supported video would display a small toolbar with a Translate icon. Clicking the icon would let you select the source and target languages and opt to display subtitles.

After you click the Translate button, the video pauses as the audio is translated on your device. Once the translation is available, the video resumes using the target language you selected. The subtitles also appear at the bottom if you choose that option.

Also: 3 AI features coming to Copilot+ PCs that I wish were on my MacBook

Microsoft has been aggressively infusing its products and services with artificial intelligence — and Edge has been one recipient of this push. Microsoft's browser offers a Copilot sidebar through which you can access the AI tool to ask questions, find information, and generate content.

Real-time translation is another AI-based capability in Microsoft's wheelhouse. Promoting the new and upcoming Copilot+ PCs at Build this week, the company touted a caption feature that will display text in English for any audio across several applications and video platforms. Supporting more than 40 languages, the captions can appear in apps and services such as Microsoft Teams, Zoom, Chrome, and Netflix.

Meet the College Dropout Who Helped OpenAI Build Sora 


OpenAI is always on the hunt for exceptional young talent. Just as the company hired Prafulla Dhariwal, who helped CEO Sam Altman create GPT-4o, OpenAI also brought on board 20-year-old Will DePue, a college dropout who was one of the major contributors to Sora, the text-to-video tool that generates life-like, hyper-realistic video footage.

Last week, DePue turned 21 and said, “I want to be careful to have a healthy relationship with age (something tech is famously bad at). Just trying to compare where I’m now to a year/two years ago, I do think that I need to take care in continuing to push myself to grow as much as I’ve done in the past.”

DePue joined OpenAI 11 months ago as a member of the technical staff through OpenAI’s residency programme, where he worked on jailbreaking and prompt injection mitigation, along with model capability evaluations and fine-tuning.

He is possibly joining 1X, a robotics company funded by OpenAI, Tiger Global, Everon, and NVIDIA.

I love this video. You're one of the first <10,000 people in human history to experience externally anthropomorphically interacting with yourself pic.twitter.com/ePZWjdDDIK

— Mehran Jalali (@mehran__jalali) May 23, 2024

In a podcast, DePue said that he joined OpenAI by basically texting every one of his connections. “I don’t have a traditional background. I just dropped out of college, built startups, and started working on projects then,” said DePue, adding that he did not care about which position he joined as long as he got to learn something.

One of the standout projects DePue worked on was WebGPT, where he built a package to run GPT models entirely in the Chrome browser. He credited Andrej Karpathy for teaching him about Transformers, which he had no idea about before starting the project.

The dropout prodigy

Even though DePue was always into tech, he said he was mostly interested in hardware and did not start coding until he turned 17. One of his friends predicted the startup market crash in 2008, which led him to move to Argentina, where he stayed for five years.

Born in Seattle, DePue attended Geffen Academy at UCLA and later pursued a BS in computer science at the University of Michigan, from which he eventually dropped out to start building things. He previously founded DeepResearch and Thrive.fyi, a Discord bot.

Homeschooled until the sixth grade, DePue said that he did not like the idea of school much. “You sit down all the time and learn from a board, don’t move, don’t yell at kids, and don’t poke your friends, it is not something that people are super good at,” he said.

However, he does admit that schools are really important. “I don’t think I was doing it for the right reasons,” he said. DePue was also an Eagle Scout with the Boy Scouts of America.

DePue believes that one of the most important things is relative competition. “We all started in different fields,” DePue explained, saying that he started by building Discord and Minecraft servers, while friends who began in other areas are now running companies at Series A and beyond.

With a massive following on X and having built a working computer in Figma, DePue claims to know what people want, but says he does not like knowing it too well because it interferes with what he actually wants to build.

‘The perception of success and the reality of success are disconnected’

In June 2023, DePue announced Alexandria: Project Tenet, an open-source project to embed all human belief, spanning 10+ religious texts and over 15 million tokens.

DePue said that a friend of his dad’s built a chat app with GPT-3 when it launched two years ago. He sat up all night talking to it and his mind was blown; this is what got him into AI.

One of the reasons DePue is probably moving from OpenAI to robotics company 1X is his belief that no one should commit to a single thing for very long, and that people should explore more options at a young age.

He suggests people get into coding and work on something they love, instead of a very ambitious project or field whose idea they picked up from some influencer on social media.

Currently, DePue has started a month-long hackathon called the ‘10% project’, where participants commit to a single project for 36 days.

So, when Altman said that OpenAI is not run by a bunch of 24-year-old programmers, it was true. It’s run by younger folks!

“In AI, at least in my opinion, the real 30 under 30 are those you’ve never heard of. They typically exist five layers down the organisational chart from the CEO.” Karpathy couldn’t have been more right.


Master Essential Data Skills to Generate Actionable Business Value with IIMA’s Advanced Business Analytics Programme

While a picture is worth a thousand words, data tells a more compelling story in today’s business world, where analytics and AI are influencing decision-making. “Data are just summaries of thousands of stories,” as Dan Heath, an American author, stated, “Tell a few of those stories to help make the data meaningful.” While data may hold tremendous amounts of business value, it remains inert unless insights are uncovered and translated into actions or business outcomes.

This is where business analytics comes into play. It transforms raw data into actionable insights that can optimise operations, enhance customer experiences, and improve overall business performance.

According to AIM Research, the proportion of data engineers with over six years of experience has increased significantly across various sectors, rising from 27% in 2023 to 38% in 2024. This suggests a growing demand for senior-level talent in the field.

Moreover, there has been a notable increase in the number of data engineers earning between INR 6-10 lakh, rising from 24% in 2023 to 32.4% in 2024, an increase of 8.4 percentage points.

Recognising the critical need for advanced analytical skills in the modern marketplace, the Indian Institute of Management Ahmedabad (IIMA) offers its Executive Programme in Advanced Business Analytics (EPABA), a four-month programme led by Prof Arnab Kumar Laha.

Prof Laha completed his master’s and PhD from the Indian Statistical Institute and has various research and publications to his credit. In 2014, Analytics India Magazine named him among the ‘10 Most Prominent Analytics Academicians’. Other notable faculty members include Prof Ankur Sinha, Prof Kavitha Ranganathan and Prof Dhiman Bhadra.

About the Programme

EPABA seeks to provide participants with comprehensive knowledge of data and analytics and their applications in marketing, finance, operations, and HR, enabling them to become proficient in the latest technologies (artificial intelligence, machine learning, and big data), software (R and Python) and methods (statistical modelling, data visualisation, and forecasting).

The course offers a blend of theoretical knowledge and practical application.

It is structured to accommodate the busy schedules of working professionals, with live interactive classes planned twice a week at designated VCNow Centres across India.

To meet the eligibility criteria for the programme, applicants need to hold a graduate degree in a relevant subject with a minimum of 50% marks. A postgraduate degree in a related field is desirable, and candidates must have at least two years of work experience in a relevant field.

The total course fee for the EPABA programme is INR 4,75,000 plus GST, payable in three installments. Additionally, there is a non-refundable application fee of INR 2,000 plus GST.

Key Features of the Course

  • Mode of Delivery: The course will be delivered through a combination of live interactive online sessions and on-campus modules at IIMA. Participants can attend the live lectures on video in any VCNow centre in India.

These sessions are scheduled conveniently on Fridays from 6 pm to 9 pm and Saturdays from 3 pm to 6 pm, accommodating the busy schedules of working professionals.

Students will get the opportunity to visit the prestigious IIMA campus for an immersive learning experience. This on-campus schedule spans nine days, spread across three visits, allowing participants to delve deeper into the subject matter and gain insights from their first-hand experiences at the institute.

  • Curriculum: The EPABA curriculum is divided into several modules, covering essential topics such as AI, ML, big data, and applications of analytics in essential business functions. This comprehensive coverage ensures that the participants build leadership and managerial capabilities in the Data Age that can meet the industry’s demands.
  • Pedagogy: The teaching methodology is highly interactive, incorporating lectures, case studies, projects, quizzes, group discussions, and simulations. This approach not only enhances learning but also fosters peer-to-peer interaction and networking among participants.
  • Hands-on Experience: Participants will gain practical experience with popular analytical tools such as R and Python, enabling them to automate tasks and extract valuable insights from data. The programme also includes a capstone project, allowing participants to apply their learnings to real-world business scenarios.

Upon meeting the assessment and attendance criteria, successful participants will be awarded a Certificate of Completion (CoC) by IIM Ahmedabad. Participants will be evaluated through assignments, quizzes and examinations for all the courses.

IIMA is currently accepting applications for the sixth batch of the EPABA programme. The second round of applications closes on May 31, 2024. For more details and application, click here.


These new apps are coming to Windows on Arm, and they’re a big deal

Copilot + PC

It's been a week for Microsoft, kicking things off with its Surface and AI event on Monday ahead of Build. At the hardware event, the company launched Copilot+ PCs, a new tier of Windows computers designed for the AI-powered future.

With the focus mainly on supporting AI applications both on-device and in the cloud, you may have missed Microsoft's nod to the various new creative applications coming to the Arm platform. Here's why those are a pretty big deal, too.

Also: I just ordered the cheapest Surface Pro option — why I (probably) won't regret it

At the Surface and AI event, Microsoft announced that Adobe's flagship apps will be available on the new Copilot+ PCs: while Photoshop, Lightroom, and Express are already available, creative apps like Illustrator, Premiere Pro, and more will arrive this summer.

Why is this a big deal? Until now, Adobe applications have not been designed to run natively on Windows on Arm. Rather, most desktop applications, like Adobe Photoshop, have been developed for the x86-64 architecture used by Intel and AMD processors. That means support for various instruction sets and high-performance multi-core processing was much more refined on the more popular platform.

This discrepancy caused Arm users to have a less-than-ideal experience using Adobe's creative suite, including limited plugin access and more bugs. Even though many Adobe alternatives exist, Adobe remains a staple for many creative professionals, and learning to use an entirely new application can be a real pain point.

Adobe and Microsoft's new partnership means that Adobe is finally creating native Arm64 versions of its applications for Copilot+ PCs, giving users a better experience with its services, including the latest AI tools like Generative Fill in Photoshop and Generative Remove in Lightroom.


"Adobe Creative Cloud customers will benefit from the full performance advantages of Copilot+ PCs to express their creativity faster than ever before," Microsoft said in a release.

Microsoft also announced that other creative apps, such as DaVinci Resolve Studio, CapCut, and Djay Pro, have been optimized to run natively on Arm64. AI features such as Magic Mask in DaVinci Resolve Studio and Auto Cutout in CapCut can run on the Neural Processing Unit (NPU) of the Copilot+ PCs. Here's hoping that this latest commitment will lead to more app developers building for Windows on Arm.


Investing in Robotics in India is Not for Everyone


Though promising developments are occurring in the field of robotics, investment challenges in this sector impede rapid progress.

“VCs come in for scaling money. They don’t scale value,” said CynLr Robotics co-founder Gokul NA, in an exclusive interaction with AIM. Elaborating, he said VCs struggle in the M&A segment, where adherence to established norms and a lack of the technological understanding needed to invest pose problems.

Further, he believes this to be a global problem as well. “Deep tech investments are not VC savvy, and the ones who are supposed to play that role, which should be the corporate VCs, do not know how to play this,” he said.

A similar stance was expressed by SML and Vizzhy founder and CEO Vishnu Vardhan, who also created Hanooman. Vardhan believes that most Indian investors are not ready to spend money on research and deep tech startups.

“If a problem is unsolved for such a large period of time, it’s a technology problem. To build a technology, you need to build a company first. To build a company, you need to build an industry before. So we don’t even have an industry yet,” said Gokul, touching upon the bitter reality of the robotics landscape in the country and globally.

CynLr Robotics is a Bengaluru-based deep tech robotics company, founded by tech enthusiasts Gokul NA and Nikhil Ramaswamy. The startup is backed by Speciale Invest and GrowX Ventures. They recently showcased their semi-humanoid CyRo at the recent Boston Robotics Summit and Expo.

“We don’t want to exit quickly, we are here for the long haul,” said Arjun Rao, one of the founders of Speciale Invest, in an interaction with AIM previously.

Challenges Galore

Experts also highlighted the issue of skill set availability. “How do you train someone on a technology that is yet to be built?” asked Gokul.

In an exclusive interaction with AIM, Flow Drive founder Mankaran Singh attributed the lack of colleges that offer robotics, as one of the major reasons for the lack of interest in this field as a whole.

“There are very few robotics engineers in India, so from a job perspective, there aren’t many companies who are working in the field of robotics and paying good salaries,” he said. With talent becoming a challenge, Singh believes that people will struggle to run robotic companies.

As per Shiksha, there are about 174 colleges offering a BTech in Robotics Engineering course in India. Out of these, 162 colleges are privately owned, and 12 colleges are owned by public or government organisations.

Little to No RoI

Revenue in the Indian robotics market is projected to reach $694.10 million. However, challenges remain.

While the skill gap is a major problem, the bigger concern is the return on investment. In India, manufacturing robots come at a very high cost. “Depending on how intelligent or what kind of task the robot performs, they are still very expensive compared to the labour. So, that’s why the RoI is very low in India now,” said Singh.

Furthermore, people are still reluctant to use robotic automation and prefer manual labour. “If you compare to foreign countries, where the cost of labour is very high, they mostly focus on robotics as there’s a good market there,” said Singh.

“They [VCs] are pushing this. They are trying to mobilise this, but that capital availability and sensibility to kind of evaluate whether this tech is needed and how much money will actually come to you to execute will be under a constraint,” said Gokul.

“Lobbying and getting the whole ecosystem around to build tools and components for you is a big bottleneck that we have. Today, we are trying to make it work with what is available and then best tune them,” he said.

Gokul also explained that sourcing parts from different countries exposes companies to currency fluctuations, where a simple dollar conversion can result in losses amounting to crores of rupees.

Furthermore, despite the huge initial development capex, the results can still be disappointing. “You spend all this money building a robot, and there’s still no guarantee that whatever prototype you have built will be sustained in production,” said Singh.

While investment challenges remain in India, relatively new companies in the West have received massive backing. Figure AI, a deep-tech robotics company known for its humanoid Figure 01, has received $675 million in funding from big players such as NVIDIA, Microsoft, Jensen Huang, and other prominent tech stars. However, this level of investment is still awaited in India.

That said, robotics has also been promising for deep tech investors in the Indian autonomous vehicle space. Swaayatt Robots founder Sanjeev Sharma believes that within a year, the company will have a pan-India presence and will have raised $1 billion. Currently, the Bhopal-based company, which focuses on autonomous driving and claims to have achieved Level 5 autonomy, has raised a total of $3 million in funding.

The post Investing in Robotics in India is Not for Everyone appeared first on AIM.

Grace Hopper Gets Busy with Science

Nvidia’s new Grace Hopper Superchip (GH200) processor has landed in nine new systems worldwide. The GH200 is a recently announced chip from Nvidia that eliminates the PCI bus from the CPU/GPU communications pathway.

As announced by Nvidia at ISC 2024, new Grace Hopper-based supercomputers coming online include EXA1-HE in France, from CEA and Eviden; Helios at Academic Computer Centre Cyfronet in Poland; Alps at the Swiss National Supercomputing Centre, from Hewlett Packard Enterprise (HPE); JUPITER at the Jülich Supercomputing Centre in Germany; DeltaAI at the National Center for Supercomputing Applications at the University of Illinois Urbana-Champaign; and Miyabi at Japan’s Joint Center for Advanced High Performance Computing, established between the Center for Computational Sciences at the University of Tsukuba and the Information Technology Center at the University of Tokyo.

Recently deployed Grace Hopper Systems (Source: Nvidia)

CEA, the French Alternative Energies and Atomic Energy Commission, and Eviden, an Atos Group company, in April announced the delivery of the EXA1-HE supercomputer based on Eviden’s BullSequana XH3000 technology. The BullSequana XH3000 architecture offers a new, patented warm-water cooling system, while the EXA1-HE is equipped with 477 compute nodes based on Grace Hopper.

“AI is accelerating research into climate change, speeding drug discovery, and leading to breakthroughs in dozens of other fields,” said Ian Buck, vice president of hyperscale and HPC at Nvidia. “Nvidia Grace Hopper-powered systems are becoming an essential part of HPC for their ability to transform industries while driving better energy efficiency.”

In addition, Isambard-AI and Isambard 3 from the University of Bristol in the U.K. and systems at the Los Alamos National Laboratory and the Texas Advanced Computing Center in the U.S. join a growing wave of Nvidia Arm-based supercomputers using Grace CPU and the Grace Hopper platform.

Eliminating the PCI Middleman

The Grace Hopper design combines an Arm-based Grace CPU with a Hopper GPU. Before Grace Hopper, CPUs (usually X86) used one or more PCI-bus-based GPUs. These additional GPUs must communicate over the PCI bus and, therefore, create two or more distinct memory domains: the CPU domain and the GPU domain. Data transfer between these domains must travel across the PCI bus, which often becomes a bottleneck.

Grace Hopper connects the CPU and GPU over the NVLink-C2C interconnect, providing a single shared memory domain. This memory-coherent, high-bandwidth, low-latency interconnect is the heart of the Grace Hopper processor and delivers up to 900 GB/s of total bandwidth.
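To put those bandwidth numbers in perspective, here is a back-of-envelope sketch comparing transfer times. The PCIe Gen5 x16 figure of roughly 64 GB/s is an assumption for illustration, as is the 40 GB payload; only the 900 GB/s NVLink-C2C figure comes from the text above.

```python
# Back-of-envelope transfer-time comparison. Assumed figures:
# PCIe Gen5 x16 peaks around 64 GB/s; NVLink-C2C is quoted at 900 GB/s total.

def transfer_time_s(bytes_moved: float, bandwidth_gb_s: float) -> float:
    """Time in seconds to move `bytes_moved` bytes at `bandwidth_gb_s` GB/s."""
    return bytes_moved / (bandwidth_gb_s * 1e9)

payload = 40e9  # an illustrative 40 GB working set (weights plus activations)

pcie = transfer_time_s(payload, 64)     # time over a PCIe-attached GPU
nvlink = transfer_time_s(payload, 900)  # time over NVLink-C2C

print(f"PCIe Gen5: {pcie:.3f} s, NVLink-C2C: {nvlink:.3f} s, "
      f"speedup: {pcie / nvlink:.1f}x")
```

At these assumed peak rates, the same data moves roughly 14x faster over NVLink-C2C, which is why removing the PCI bottleneck matters for workloads that shuttle data between CPU and GPU memory.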

Figure 2 shows a 3X performance gain for Grace Hopper over the traditional PCI bus-based CPU/GPU design on a coupled ocean/atmospheric model.

Figure 2: Coupled Ocean Model. Source: Nvidia

Sovereign AI and HPC

The drive to construct new, more efficient AI-based supercomputers is accelerating as countries worldwide recognize sovereign AI’s strategic and cultural importance — investing in domestically owned and hosted data, infrastructure, and workforces to foster innovation.

Bringing together the Arm-based Nvidia Grace CPU and Hopper GPU architectures, the GH200 is a new optimized design for scientific supercomputing centers worldwide. Many centers plan to go from system installation to real science in months instead of years.

As an example, the Isambard-AI phase one consists of an HPE Cray Supercomputing EX2500 with 168 Nvidia GH200 Superchips, making it one of the most efficient supercomputers ever built. When the remaining 5,280 Nvidia Grace Hopper Superchips arrive at the University of Bristol’s National Composites Centre this summer, performance will increase by a factor of thirty-two.

“Isambard-AI positions the U.K. as a global leader in AI and will help foster open science innovation domestically and internationally,” said Prof. Simon McIntosh-Smith, University of Bristol. “Working with Nvidia, we delivered phase one of the project in record time, and when completed this summer, we will see a massive jump in performance to advance data analytics, drug discovery, climate research, and many more areas.”

MoE will Power the Next Generation of Indic LLMs

The potential of MoE in making Indic LLMs is immense. In a recent podcast with AIM, CognitiveLab founder Aditya Kolavi said that the company has been using the MoE (Mixture of Experts) architecture to fuse Indian languages and build multilingual LLMs.

“We have used the MoE architecture to fuse Hindi, Tamil, and Kannada, and it worked out pretty well,” he said.

Similarly, Reliance-backed TWO has released its AI model SUTRA, which uses MoE and supports 50+ languages, including Gujarati, Hindi, Tamil, and more, surpassing GPT-3.5.

Ola Krutrim is also leveraging Databricks’ Lakehouse Platform to enhance its data analytics and AI capabilities while hinting at using MoE to power its Indic LLM platform.

Apart from Indic LLMs, GPT-4, Mixtral-8x7B, Grok-1 and DBRX are powered by MoE. These are some excellent examples of how impactful this architecture is.

How can MoE help India make better LLMs?

Although datasets are available for the 22 official Indian languages, hundreds of other actively used local languages and dialects need representation in Indic LLMs. One challenge that Indian developers face is the lack of quality Indian data.

MoE models are promising for machine translation tasks where there is little data available to train on. They help prevent the model from overfitting to the limited data, a common issue with small datasets.

MoE layers allow models to handle multiple languages: they can learn language-specific representations in individual experts while sharing core knowledge across languages. This sharing is useful for transferring what is learned from data-rich languages like Hindi to related languages with less data available.
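As a rough illustration of how this routing works, here is a minimal from-scratch sketch of a top-k MoE layer. The dimensions, expert count and random weights are all toy values, not those of any production model.

```python
import math
import random

random.seed(0)

DIM, N_EXPERTS, TOP_K = 4, 8, 2

# Each "expert" is a tiny linear map; the gate holds one score vector per expert.
experts = [[[random.gauss(0, 0.5) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(N_EXPERTS)]
gate_w = [[random.gauss(0, 0.5) for _ in range(DIM)] for _ in range(N_EXPERTS)]

def matvec(m, v):
    return [sum(mi * vi for mi, vi in zip(row, v)) for row in m]

def softmax(xs):
    mx = max(xs)
    exps = [math.exp(x - mx) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(x):
    # 1. Gate: score every expert for this token.
    scores = [sum(w * xi for w, xi in zip(gw, x)) for gw in gate_w]
    # 2. Route: keep only the top-k experts (sparse activation).
    top = sorted(range(N_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    weights = softmax([scores[i] for i in top])
    # 3. Combine: weighted sum of the chosen experts' outputs.
    out = [0.0] * DIM
    for w, i in zip(weights, top):
        y = matvec(experts[i], x)
        out = [o + w * yi for o, yi in zip(out, y)]
    return out, top

token = [0.1, -0.3, 0.7, 0.2]
y, routed = moe_forward(token)
print(f"routed to experts {routed}; only {TOP_K}/{N_EXPERTS} experts ran")
```

In a multilingual setting, the gate can learn to send, say, Kannada tokens mostly to one subset of experts and Hindi tokens to another, while shared experts capture cross-lingual structure.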

DBRX is a great example of how you can achieve efficiency and cost-effectiveness using MoE.

“The economics are so much better for serving. They’re more than 2X better in terms of FLOPs, or floating point operations, required to do the serving,” shared Naveen Rao, the VP of generative AI at Databricks, in an exclusive interaction with AIM.

“DBRX is actually better than Llama 3 and Gemma for Indic languages,” said Ramsri Goutham Golla, the founder of Telugu LLM Labs, in an interview with AIM, particularly in the context of instruction tuning. The company was recently featured in Google I/O for leveraging Gemma to create Navarasa.

In terms of energy efficiency, MoE can help train larger models with less compute, a crucial factor for developing countries like India. For example, Google’s 1.2 trillion-parameter GLaM model required only 456 megawatt-hours to train, compared to 1,287 megawatt-hours for the 175B-parameter GPT-3, while outperforming it.

With MoE, one can also reduce cost while scaling the model: Google’s 1.6T-parameter Switch Transformer was trained on a computational budget similar to that of a 13B-parameter T5 model.
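The savings in both examples come from the same arithmetic: only the routed experts run for each token, so the active parameters (and FLOPs) stay far below the total. A sketch of that calculation, using a hypothetical configuration rather than GLaM's or Switch Transformer's actual dimensions:

```python
def moe_params(d_model, d_ff, n_experts, top_k, n_layers):
    """Total vs per-token-active FFN parameters for a stack of MoE layers.
    Each expert FFN has two weight matrices: d_model x d_ff and d_ff x d_model."""
    per_expert = 2 * d_model * d_ff
    total = n_layers * n_experts * per_expert   # all experts stored in memory
    active = n_layers * top_k * per_expert      # only top-k run per token
    return total, active

# Hypothetical config: 32 layers, 64 experts per layer, top-2 routing.
total, active = moe_params(d_model=4096, d_ff=16384,
                           n_experts=64, top_k=2, n_layers=32)
print(f"total FFN params: {total / 1e9:.1f}B, "
      f"active per token: {active / 1e9:.1f}B "
      f"({total // active}x ratio of stored to computed parameters)")
```

Under these assumed dimensions the model stores ~275B FFN parameters but computes with only ~8.6B per token, which is the mechanism behind the GLaM and Switch Transformer compute figures above.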

Going beyond MoE

Another good example of an MoE model is Jamba, developed by AI21 Labs, which combines the strengths of Transformer and structured state space model (SSM) architectures.

It applies MoE at every other layer, with 16 experts, and uses the top 2 experts at each token. “The more the MoE layers, and the more the experts in each MoE layer, the larger is the total number of model parameters,” wrote AI21 Labs in Jamba’s research paper.

A similar but enhanced approach to MoE can be utilising Recurrent Independent Mechanisms (RIMs). RIMs consist of multiple independent recurrent modules that interact sparsely, allowing for dynamic and modular computation.

They can adapt to changes in the input distribution and handle out-of-distribution generalisation better than Transformers.

Another good idea is using Structured State Space (S4) models. These use a state space representation to capture long-range dependencies more efficiently than Transformers. Their linear memory footprint and fixed-size per-step state make them more scalable for longer sequences.
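To make the idea concrete, here is a toy discrete state-space recurrence. It is not the actual S4 parameterisation (which structures the state matrix specially and trains via a convolutional view); the toy matrices below are assumptions chosen only to show that the entire sequence history is carried in a fixed-size state.

```python
# Toy discrete state-space recurrence: h[k+1] = A h[k] + B u[k], y[k] = C h[k].
# Memory per step is just the state vector h, regardless of sequence length.

STATE = 3

A = [[0.9, 0.1, 0.0],
     [0.0, 0.8, 0.1],
     [0.0, 0.0, 0.7]]   # toy stable transition matrix (assumption)
B = [1.0, 0.5, 0.25]    # toy input projection
C = [0.2, 0.3, 0.5]     # toy output projection

def ssm_scan(inputs):
    h = [0.0] * STATE
    outputs = []
    for u in inputs:
        # State update: the whole history so far is compressed into h.
        h = [sum(a * hj for a, hj in zip(row, h)) + b * u
             for row, b in zip(A, B)]
        outputs.append(sum(c * hj for c, hj in zip(C, h)))
    return outputs

ys = ssm_scan([1.0, 0.0, 0.0, 0.0])  # impulse response decays step by step
print([round(y, 4) for y in ys])
```

Because the state `h` never grows, processing a sequence twice as long costs twice the compute but no extra memory for history, unlike attention, whose key/value cache grows with sequence length.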

Simply put, MoE can help India build LLMs by addressing challenges like data scarcity, energy requirements and cost. While it seems most useful for merging already available LLMs, it can also be applied to future models built from scratch.

The post MoE will Power the Next Generation of Indic LLMs appeared first on AIM.

Google Home pro tip: How to use Gemini to broadcast messages between your Nest devices

I'm far from lazy. But sometimes I just want to send my wife a message to let her know (even when I'm knee-deep in a writing project) that I'm thinking of her. Or, maybe I have to communicate something to her and I'm not in a place where I can leave my office (or I'm working in the basement). And because I'm not one for shouting as a form of communication inside my home, I've found a better way to transmit my message — thanks to Gemini and my Google Home Speakers.

We have two speakers, one at the front of the house and one at the back of the house (both on the first floor). That means I can send my messages to either one of those speakers or both of them from the convenience of my phone.

This feature is new, so it can be a bit buggy. Sometimes, I find, there's enough lag that Gemini doesn't catch my entire message. Also, sometimes the message will be broadcast in the default speaker voice and sometimes it will be a recording of my voice. I guess it depends on how Google's latest AI model is feeling at that moment.

Or maybe it's still a bit buggy.

Either way, it's a fun little feature that I accidentally happened upon one day and have used quite often ever since.

Let me show you how it's done.

Do note that this feature works with either Gemini or Google Assistant. I switched over to Gemini some time ago, but I've tested it with Google Assistant and it works the same. So, whether you've opted into Gemini or are remaining with Assistant (until Google forces the move), you're good to go.

How to broadcast a message

What you'll need: To make this work, you'll need an Android device, at least two Nest speakers (or other compatible speakers or displays) connected to Google Home, and at least one member of your household or organization signed into each of them. The speakers/displays must also be connected to the same wireless network and be running firmware 1.39154941 or later.

With those things taken care of, you're ready to broadcast.

I've found the broadcast feature to come in quite handy on several occasions. Besides it being handy, it's also a lot of fun to send someone a message when they're least expecting it.