AMD’s Su-premacy Begins

This year, AMD’s Advancing AI event was on another level. The company made it clear it’s not afraid of NVIDIA. It launched the new Instinct MI350 Series GPUs, built on the CDNA 4 architecture, promising a fourfold generational improvement in AI compute and a 35x leap in inferencing performance.

It also launched ROCm 7.0, its open software stack for GPU computing, and previewed the upcoming MI400 Series and Helios AI rack infrastructure.

The company said that the MI350X and MI355X GPUs feature 288GB of HBM3E memory and offer up to 8TB/s of memory bandwidth. “MI355 delivers 35x higher throughput when operating at ultra-low latencies, which is required for some real-time applications like code completion, simultaneous translation, and transcription,” said AMD CEO Lisa Su.

Su said that models like Llama 4 Maverick and DeepSeek R1 have seen triple the tokens per second on the MI355 compared to the previous generation. This leads to faster responses and higher user throughput. “The MI355 offers up to 40% more tokens per dollar compared to NVIDIA B200,” she added.

Each MI355X platform can deliver up to 161 PFLOPs of FP4 performance using structured sparsity. The series supports both air-cooled (64 GPUs) and direct liquid-cooled (128 GPUs) configurations, offering up to 2.6 exaFLOPs of FP4/FP6 compute.
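The rack-level figures follow directly from the per-platform numbers. A quick back-of-the-envelope check, assuming eight GPUs per MI355X platform (an assumption; the article does not state the platform size):

```python
# Sanity check of the quoted rack-scale figures for the MI350 Series.
# Assumption (not stated in the article): one MI355X platform = 8 GPUs.
GPUS_PER_PLATFORM = 8
PFLOPS_FP4_PER_PLATFORM = 161  # FP4 with structured sparsity, per platform

liquid_cooled_gpus = 128       # direct liquid-cooled rack configuration
platforms = liquid_cooled_gpus // GPUS_PER_PLATFORM   # 16 platforms
rack_pflops = platforms * PFLOPS_FP4_PER_PLATFORM     # 2,576 PFLOPs
rack_exaflops = rack_pflops / 1000                    # ~2.6 exaFLOPs

hbm_per_gpu_gb = 288           # HBM3E capacity per MI355X GPU
rack_hbm_tb = liquid_cooled_gpus * hbm_per_gpu_gb / 1024  # 36 TB (binary TB)

print(f"{rack_exaflops:.2f} exaFLOPs FP4, {rack_hbm_tb:.0f} TB HBM3E")
```

The result, roughly 2.6 exaFLOPs and 36TB of HBM3E per liquid-cooled rack, matches the numbers AMD quotes for this configuration.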

The Instinct MI400 Series, expected in 2026, will feature up to 432GB of HBM4 memory and 19.6TB/s of bandwidth. It is set to deliver 40 PFLOPs of FP4 and 20 PFLOPs of FP8 performance.

Speaking about the company’s open-source software ROCm, Vamsi Boppana, senior vice president of AMD’s artificial intelligence group, said it now powers some of the largest AI platforms in the world, supporting leading models like Llama and DeepSeek from day one, and delivering over 3.5x inference gains in the upcoming ROCm 7 release.

He added that frequent updates, support for FP4 data types, and new algorithms like FAv3 are helping ROCm deliver better performance and push open-source frameworks like vLLM and SGLang ahead of closed-source offerings.

“With over 1.8 million Hugging Face models running out of the box, and industry benchmarks now in play, ROCm is not just catching up; it’s leading the open AI revolution,” he added.

Huge Partner Ecosystem

AMD is working with leading AI companies, including Meta, OpenAI, xAI, Oracle, Microsoft, Cohere, HUMAIN, Red Hat, Astera Labs, and Marvell. Su said the company expects the market for AI processors to exceed $500 billion by 2028.

The event, which took place in San Jose, California, also saw OpenAI CEO Sam Altman share the stage with Su. “We’re working closely with AMD on infrastructure for research and production. Our GPT models are running on MI300X in Azure, and we’re deeply engaged in design efforts on the MI400 Series,” Altman said.

Meanwhile, Meta said its Llama 3 and Llama 4 inference workloads are running on MI300X and that it expects further improvements from the MI350 and MI400 Series.

Oracle Cloud Infrastructure is among the first to adopt the new system, with plans to offer zettascale AI clusters comprising up to 131,072 MI355X GPUs. Microsoft confirmed that proprietary and open-source models are now running in production on Azure using the MI300X.

Cohere said its Command models use the MI300X for enterprise inference. HUMAIN announced a partnership with AMD to build a scalable and cost-efficient AI platform using AMD’s full compute portfolio.

Building the Infrastructure for AI Agents

AMD announced its new open-standard rack-scale infrastructure to meet the growing demands of agentic AI workloads, launching solutions that integrate Instinct MI350 GPUs, 5th Gen EPYC CPUs, and Pensando Pollara NICs.

“We have taken the lead on helping the industry develop open standards, allowing everyone in the ecosystem to innovate and work together to drive AI forward. We fully reject the notion that one company could have a monopoly on AI or AI innovation,” said Forrest Norrod, AMD’s executive vice president.

The company also previewed Helios, its next-generation rack platform built around the upcoming MI400 GPUs and Venice CPUs. Su said Venice is built on TSMC’s 2-nanometer process, features up to 256 high-performance Zen 6 cores, and delivers 70% more compute performance than AMD’s current-generation leadership CPUs.

“Helios functions like a single, massive compute engine. It connects up to 72 GPUs with 260 terabytes per second of scale-up bandwidth, enabling 2.9 exaflops of FP4 performance,” she said, adding that compared to the competition, it supports 50% more HBM4 memory, memory bandwidth, and scale-out bandwidth.
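The Helios throughput figure lines up with the per-GPU MI400 specs quoted earlier in the article. A quick check:

```python
# Sanity check: Helios rack FP4 throughput vs. per-GPU MI400 specs.
mi400_fp4_pflops = 40   # quoted FP4 PFLOPs per MI400 GPU
gpus_per_rack = 72      # GPUs connected via UALink in a Helios rack

rack_exaflops = gpus_per_rack * mi400_fp4_pflops / 1000
print(f"{rack_exaflops:.2f} exaFLOPs FP4")  # 2.88, i.e. the quoted ~2.9
```

72 GPUs at 40 PFLOPs each gives 2,880 PFLOPs, which rounds to the 2.9 exaFLOPs Su cited.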

AMD’s Venice CPUs bring up to 256 cores and higher memory bandwidth, while Vulcano AI NICs support 800G networking and UALink. “Choosing the right CPU gets the most out of your GPU,” said Norrod.

Helios uses UALink to connect 72 GPUs as a unified system, offering open, vendor-neutral scale-up performance.

Describing UALink as a key differentiator, Norrod said one of its most important features is that it is “an open ecosystem”: a protocol that works across systems regardless of the CPU, accelerator, or switch vendor. He added that AMD believes open interoperability accelerates innovation, protects customer choice, and still delivers leadership performance and efficiency.

As AI workloads grow in complexity and scale, AMD says a unified stack is essential, combining high-performance GPUs, CPUs, and intelligent networking to support multi-agent systems across industries.

The currently available solution supports up to 128 Instinct MI350 GPUs per rack with up to 36TB of HBM3E memory. The infrastructure is built on Open Compute Project (OCP) standards and Ultra Ethernet Consortium (UEC) compliance, allowing interoperability with existing infrastructure.

OCI will be among the first to adopt the MI355X-based rack-scale platform. “We will be one of the first to offer the MI355X rack-scale infrastructure using the combined power of EPYC, Instinct, and Pensando,” said Mahesh Thiagarajan, EVP at OCI.

Beyond that, the new Helios rack solution, expected in 2026, brings tighter integration and higher throughput. It includes next-gen MI400 GPUs, offering up to 432GB of HBM4 memory and 40 petaflops of FP4 performance.

The post AMD’s Su-premacy Begins appeared first on Analytics India Magazine.
