How AI Chips Stole the Spotlight in 2024

When discussing AI, one often thinks of sophisticated software and intelligent programs. Behind the scenes, however, is a vast world of hardware that makes it all possible. Think of AI hardware and chip-making companies as the backstage crew of a big production, ensuring everything is perfectly aligned so AI can shine.

Chipmakers like NVIDIA and AMD build the specialised processors that power everything from driverless cars to smart gadgets.

Companies are constantly trying to build faster and more powerful chips because AI workloads demand enormous compute. The race to make the best chip intensifies as AI grows more advanced, and every small improvement makes a big difference.

However, starting a hardware company is no easy feat.

In India, Surat-based Vicharak took on the herculean task of building hardware in-house, designed specifically for AI workloads. The company recently secured ₹1 crore in funding at a valuation of ₹100 crore.

We’ve received ₹1 crore in funding at a ₹100 crore valuation. We’ve often heard that doing hardware is hard in a city like Surat or even in India. We’ve gone through hundreds of failed prototypes and iterations in our labs, but somebody has to start at some point, right?
This…

— Vicharak (@Vicharak_In) June 25, 2024

Speaking with AIM, founder and CEO Akshar Vastarpara said that Vicharak’s focus is on creating hardware and redefining computing technology.

“Our first target is to develop a GPU-like technology that can be used in mobile phones, laptops, and servers. We are approaching this in a very different way, starting with the consumer base but scaling to servers and lower-level areas as well,” Vastarpara explained.

This approach led to the creation of Vaaman, a compact computing board featuring a six-core ARM CPU and a field-programmable gate array (FPGA) with 112,128 logic cells. Its design targets workloads beyond the reach of comparable boards, offering 300 Mbps FPGA-CPU connectivity for superior hardware acceleration and parallel computing.

The unique creation garnered a lot of attention on social media.

This is such an insane white pill moment for me.
Back in college, I used to watch such reviews by western YouTubers of western tech companies.
Now I’m watching an Indian YouTuber review a deep tech product from an Indian company.
We’re gonna make it folks. pic.twitter.com/RwkOcxMzmK

— Varsh (@infinite_varsh) June 26, 2024

In this article, AIM explores the importance of AI chips and the most effective strategies that have enhanced their performance in 2024.

The Inference Power Players

In the race to power AI applications, inference chips are the unsung heroes driving real-time decisions, from chatbots to recommendation engines. These specialised processors are the backbone of modern AI, delivering speed and efficiency where it matters most.

NVIDIA rolled out its highly anticipated H200 Tensor Core GPU, a successor to the H100, designed for generative AI and high-performance computing workloads. It introduced faster HBM3e memory for improved efficiency.

Then came the B100 GPU, built on the new Blackwell architecture and tailored for AI training and inference, continuing NVIDIA’s focus on accelerating AI advancements.

Earlier in the year, NVIDIA launched its GH200 chip, which combines a GPU with an ARM-based CPU. By October, OpenAI announced on X that it had received the first engineering builds of NVIDIA’s DGX B200.

Notably, NVIDIA CEO Jensen Huang personally delivered the first GPU to Elon Musk and presented the first DGX H200 to OpenAI’s Sam Altman and Greg Brockman.

In a similar vein, Microsoft announced that its Azure platform became the first cloud service to run NVIDIA’s Blackwell system, featuring AI servers powered by the GB200.

NVIDIA reported a record-breaking $22.6 billion in data centre revenue for the quarter, up 23% sequentially and 427% year-over-year, fuelled by demand for the Hopper GPU platform. During the earnings call, Huang hinted at future advancements, stating, “After Blackwell, there’s another chip. We’re on a one-year rhythm.”

On the other hand, Google’s parent company Alphabet released two notable AI chips: the Cloud TPU v5p and Trillium. The TPU v5p was engineered specifically for training LLMs and generative AI, with each pod containing 8,960 chips and offering 4,800 Gbps of inter-chip bandwidth. Trillium, a high-performance chip for AI data centres, delivers nearly five times the speed of its predecessor, the TPU v5e.

Both chips are integral to Google Cloud’s AI infrastructure, reinforcing Alphabet’s competitive edge in the AI chip market alongside its broader investments in custom hardware.
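To give a flavour of how developers actually target these accelerators, here is a minimal sketch in JAX, the framework Google commonly pairs with Cloud TPUs. The matrix sizes are illustrative assumptions, and the same code runs unchanged on CPU, GPU or TPU.

```python
import jax
import jax.numpy as jnp

# On a Cloud TPU VM this reports the TPU cores visible to the host;
# on an ordinary machine it falls back to the CPU, so the sketch
# runs anywhere.
devices = jax.devices()
print(f"Backend: {devices[0].platform}, cores visible: {len(devices)}")

# A jitted matmul is compiled through XLA and dispatched to the
# default device, whichever accelerator that happens to be.
@jax.jit
def matmul(a, b):
    return jnp.dot(a, b)

k1, k2 = jax.random.split(jax.random.PRNGKey(0))
a = jax.random.normal(k1, (1024, 1024))
b = jax.random.normal(k2, (1024, 1024))
print(matmul(a, b).shape)  # (1024, 1024)
```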

AMD announced the MI325X AI chip in June 2024. The company also readied its next generation of EPYC and Ryzen processors, built on its latest Zen 5 CPU microarchitecture.

Earlier, the company launched the MI300A and MI300X AI chips. The MI300A combines a GPU with 228 compute units and 24 CPU cores, while the MI300X is a GPU-only model featuring 304 compute units. The MI300X competes with NVIDIA’s H100 on memory capacity and throughput.

AIM earlier covered the integration of AMD’s EPYC CPUs with NVIDIA’s HGX and MGX GPUs, which enhances AI and data centre performance while supporting open standards for greater flexibility and scalability.

Similarly, AWS has extended its focus from cloud infrastructure to custom chips. Its Elastic Compute Cloud (EC2) Trn1 instances are purpose-built for deep learning and large-scale generative models, and are powered by AWS Trainium chips, the company’s custom AI accelerators.

The trn1.2xlarge instance was the first iteration, with a single Trainium accelerator, 32 GB of instance memory and 12.5 Gbps of network bandwidth. Amazon has since introduced the trn1.32xlarge instance, which has 16 accelerators, 512 GB of instance memory and 1,600 Gbps of network bandwidth. The company is planning to roll out its latest AI chip, Trainium 2, in the coming weeks. As the Financial Times reported, the chip is aimed at training AI models at scale.
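As a rough illustration of how this capacity is consumed, the hedged sketch below uses boto3 to request a trn1.32xlarge instance. The AMI ID and region are placeholders; a real deployment would use a Neuron-enabled Deep Learning AMI in a region that offers Trn1.

```python
import boto3

# Region and AMI ID are placeholders, not real values.
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",  # hypothetical AMI ID
    InstanceType="trn1.32xlarge",     # 16 Trainium accelerators, 512 GB
    MinCount=1,
    MaxCount=1,
)
print("Launched:", response["Instances"][0]["InstanceId"])
```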

“The second version of Trainium – Trainium 2 – will start to ramp up in the next few weeks, and I think it’s going to be very compelling for customers on a price-performance basis,” said Amazon CEO and former AWS chief Andy Jassy.

The report further revealed that Amazon’s other AI chip, Inferentia, saves customers approximately 40% on costs for generating responses from AI models.

In a bid to keep pace with the growing demand for semiconductors capable of training and deploying large AI models, Intel announced its latest AI chip, Gaudi 3, at Intel Vision 2024.

The chip, first revealed by CEO Pat Gelsinger at the Intel AI Everywhere event, has double the power efficiency of its predecessor and can run AI models 1.5 times faster than NVIDIA’s H100 GPU.

It offers various configurations, including a bundle of eight Gaudi 3 chips on one motherboard or a card that can be integrated into existing systems.

Gaudi 3, built on a 5 nm process, signals Intel’s move to more advanced manufacturing techniques. According to Gelsinger, Intel plans to manufacture AI chips, potentially for external companies, at a new Ohio factory expected to open in the coming years.

On the Edge of Innovation

Training and edge AI chips are the secret sauce fuelling AI’s learning process, whether in the cloud or directly on your device. These chips transform raw data into actionable intelligence, driving AI’s next big leap.

American AI company Cerebras Systems, in collaboration with Abu Dhabi-based AI holding company G42, announced the development of Condor Galaxy 3 (CG-3), the latest addition to their AI supercomputing constellation, in 2024.

CG-3 features 64 of Cerebras’ newly launched CS-3 systems, each powered by the WSE-3 chip. Slated for availability by Q2 2024, it is set to deliver eight exaFLOPs of AI computing power. This marks the third generation of AI supercomputers built by Cerebras Systems in collaboration with G42.

The WSE-3 chip at the heart of each CS-3 system boasts 4 trillion transistors and delivers 125 petaflops of peak AI performance, so 64 systems account for the advertised eight exaFLOPs (64 × 125 petaflops = 8,000 petaflops). The WSE-3 is designed to double the performance of its predecessor while maintaining the same power consumption and price, making it ideal for training the industry’s largest AI models.

This announcement follows the release of the second phase of the Condor Galaxy supercomputer, known as Condor Galaxy 2, last November.

The Apple Neural Engine, a set of specialised cores in Apple silicon, furthered the company’s AI hardware design and performance. On Macs, it debuted with the M1 chip: compared to the previous generation, MacBooks with an M1 are 3.5 times faster in general performance and five times faster in graphics performance.

After the success of the M1, the company announced further generations. In 2024, Apple released the M4 chip, initially available only in the iPad Pro. The M4 has a Neural Engine that is three times faster than the M1’s and a CPU that is 1.5 times faster than the M2’s.

“The new iPad Pro with M4 is a great example of how building best-in-class custom silicon enables breakthrough products,” said Johny Srouji, Apple’s senior vice president of hardware technologies.
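For a sense of how applications reach the Neural Engine, here is a hedged sketch using coremltools: it converts a toy PyTorch model to Core ML and asks the runtime to schedule supported layers on the Neural Engine, falling back to the GPU or CPU otherwise. The model and shapes are invented for illustration.

```python
import coremltools as ct
import torch

# A toy network standing in for a real trained model.
class TinyNet(torch.nn.Module):
    def forward(self, x):
        return torch.relu(x @ x.transpose(0, 1))

traced = torch.jit.trace(TinyNet().eval(), torch.randn(4, 4))

# compute_units=ALL lets Core ML place supported layers on the
# Neural Engine where possible.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(shape=(4, 4))],
    compute_units=ct.ComputeUnit.ALL,
    convert_to="mlprogram",
)
mlmodel.save("tiny.mlpackage")  # loadable from an app via Core ML
```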

After the success of its first specialised AI chip, Telum, IBM introduced its Telum II Processor in August. This processor is designed to power the next-generation IBM Z systems. In addition, IBM unveiled the Spyre Accelerator at the Hot Chips 2024 conference. These chips are likely to become available in 2025.

Clearly, IBM is determined to design powerful successors that can outpace its competitors.

Currently, IBM is working on the NorthPole AI chip, which does not yet have a release date.
