At Advancing AI 2024, AMD recognised the explosive growth of AI models and the strain it places on both network and computational infrastructure.
The company emphasised that every component, from CPUs, GPUs, and DPUs to networking infrastructure, must be tailored to fuel AI workloads efficiently and at scale. AMD's innovations aim to remove the barriers that keep GPUs from operating at their full potential, enabling faster training and inference cycles and, ultimately, an ecosystem where AI models can grow without limits.
“AMD is the only company that can deliver the full set of CPU, GPU, and networking solutions to address all of the needs of the modern data center. And, we have accelerated our roadmaps to deliver even more innovations across both our Instinct and EPYC portfolios, while also working with an open ecosystem of other leaders to deliver industry-leading networking solutions,” said Lisa Su, chair and CEO at AMD.
AMD’s senior vice president of networking, Soni Jiandani, emphasised the growing demands of AI, stating, “Networking is not only critical, but it’s foundational to drive optimal performance.”
With the launch of the UEC-ready Pollara 400 AI networking adapter, AMD promises to elevate performance by ensuring seamless GPU communication across AI clusters, meeting the ever-growing demands of modern AI workloads.
This innovation, along with other announcements such as the third-generation Salina DPU, positions AMD to take the computing and networking industries to the next level, offering up to a six-fold performance improvement in AI training times.
“The explosive growth of AI workloads requires innovations across GPUs, CPUs, and networking. AMD is innovating on all fronts to address this growth with our products like Pollara 400 and Salina DPU,” said Jiandani.
Pollara 400 AI Networking Adapter: This new networking card is the first UEC-ready AI adapter, designed to improve GPU communication by intelligently rerouting data across optimal paths, reducing congestion and improving job completion times (see the sketch after this list).
Salina DPU: AMD announced the release of its third-generation DPU, the Salina, which offers 400 Gbps throughput, accelerating AI workloads with better security, load balancing, and congestion management.
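AMD has not published the adapter's internal algorithms, but the behaviour described above, steering traffic away from congested paths based on network feedback, can be illustrated with a minimal Python sketch. Everything here (the class names, the ECN-style feedback signal, the threshold) is hypothetical and stands in for whatever the hardware actually does.

```python
# Illustrative only: a toy model of congestion-aware multipath selection,
# not the Pollara 400's actual firmware logic. All names are hypothetical.
from dataclasses import dataclass


@dataclass
class Path:
    path_id: int
    congestion: float = 0.0  # e.g. fraction of ECN-marked packets seen


class MultipathSelector:
    """Spray traffic across paths, steering away from congested ones."""

    def __init__(self, paths: list[Path], threshold: float = 0.3):
        self.paths = paths
        self.threshold = threshold

    def record_feedback(self, path_id: int, congestion: float) -> None:
        # Update per-path congestion from receiver/switch feedback.
        for p in self.paths:
            if p.path_id == path_id:
                p.congestion = congestion

    def pick_path(self) -> Path:
        # Prefer paths below the congestion threshold; otherwise fall
        # back to the least congested path available.
        healthy = [p for p in self.paths if p.congestion < self.threshold]
        return min(healthy or self.paths, key=lambda p: p.congestion)


selector = MultipathSelector([Path(0), Path(1), Path(2)])
selector.record_feedback(0, 0.8)      # path 0 reports heavy congestion
print(selector.pick_path().path_id)   # traffic shifts away from path 0
```

The point of the sketch is the feedback loop: because the adapter sees per-path congestion, flows can be moved off a hot path mid-job rather than waiting for a stalled collective to finish, which is what shortens job completion times.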
“To truly feed the AI beast, we must innovate across every component—GPUs, CPUs, and networking. The Pollara 400 and Salina DPU are designed to handle the immense scale and speed required by AI workloads, ensuring the AI beast is always running at its full potential,” said Jiandani.
Forrest Norrod, EVP and GM of AMD's data center solutions business, said that both the Salina and Pollara will be available early next year.
Rajagopal Subramaniyan, SVP of networking at Oracle Cloud Infrastructure (OCI), said the company is leveraging AMD's Elba DPU for high performance and scalability in its AI workloads.
“The programmable architecture of the Elba DPUs has helped us leverage the flexibility to launch features and services for our customers with software-like agility, while maintaining line rate at the hardware. The Elba delivered 5x improvement in performance… which was critical for us to cater to the demands of high-performance requirements of some of our key customers,” he added, looking forward to using the Salina and Pollara.
IBM Cloud’s Ajay Apte said the company has adopted AMD’s DPUs for virtualised environments, improving security and performance for enterprise AI and cloud workloads.
“By moving our crown jewel, which is the SDN stack, into the DPU, we are essentially increasing the security aspect of our network… with AMD Pensando’s programmable DPUs, we are now running our virtualised machine offerings on a more secure and performance-optimized platform,” he added.
AMD Builds the Next-Gen AI Smart Switch
At Advancing AI, Cisco, Microsoft, and AMD announced that they are joining forces to develop a next-generation AI Smart Switch that integrates AMD’s programmable DPU into data center switches. This collaboration aims to offload tasks such as security, load balancing, and network traffic management from CPUs and GPUs, significantly improving the performance of AI networks.
Narayan Annamalai, Microsoft Azure general manager and head of products, highlighted the importance of this development, stating, “We are building an extended network… using DPUs to offload the workload and apply security policies before traffic hits storage or services.”
“Hypershield is our AI-driven, next-generation security architecture that runs on top of AMD’s Pensando DPU inside Cisco servers,” added Cisco’s senior vice president, Jeremy Foster.
This partnership brings AMD’s DPU technology into mainstream networking, delivering faster, smarter, and more secure AI infrastructures. By embedding AMD’s DPUs into Cisco switches and Microsoft’s infrastructure, these companies are ensuring that AI networks can scale efficiently and handle the demands of emerging AI workloads.
“Instead of having everything done within the compute servers, we can bring that into the switch over time and implement those technologies… This capability becomes just another capability in your networking fabric across all the switches you are deploying,” said Kevin Wollenweber, SVP and general manager of data center and provider connectivity at Cisco.
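To make the offload model concrete, here is a toy Python sketch of a switch-embedded DPU applying a security policy and a load-balancing decision before traffic ever reaches a host. It is purely illustrative; the class, policy format, and backend names are invented for this example and do not reflect Cisco's or AMD's implementation.

```python
# A toy model of the AI smart switch idea: security filtering and load
# balancing run on a DPU inside the switch, not on host CPUs or GPUs.
# Purely illustrative; all names and policies here are invented.
from dataclasses import dataclass


@dataclass
class Packet:
    src: str
    dst_service: str


class SwitchDPU:
    """Applies policy and picks a backend before traffic reaches servers."""

    def __init__(self, allowed_sources: set[str],
                 backends: dict[str, list[str]]):
        self.allowed_sources = allowed_sources
        self.backends = backends
        self.rr = 0  # round-robin counter for load balancing

    def process(self, pkt: Packet) -> str | None:
        # Security policy enforced in the fabric: disallowed sources are
        # dropped here, so hosts never spend cycles on them.
        if pkt.src not in self.allowed_sources:
            return None
        # Load balancing also happens in the switch.
        pool = self.backends[pkt.dst_service]
        backend = pool[self.rr % len(pool)]
        self.rr += 1
        return backend


dpu = SwitchDPU({"10.0.0.5"}, {"storage": ["node-a", "node-b"]})
print(dpu.process(Packet("10.0.0.5", "storage")))   # -> node-a
print(dpu.process(Packet("192.0.2.9", "storage")))  # -> None (dropped)
```

The design point Wollenweber describes is exactly this placement: the same filtering and balancing logic, moved from every compute server into a capability of the fabric itself.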
AMD Bets Big on Ethernet
AMD is betting big on Ethernet as the foundation for its AI vision, positioning the technology as the most scalable, cost-effective solution for the growing demands of AI workloads.
“Ethernet is clearly the preferred choice for back-end and front-end networks for AI workloads… delivering over 50% cost savings and scalability advantages,” averred Jiandani.
For instance, AMD’s new Pollara 400 AI networking adapter and its third-generation Salina DPU are at the heart of this Ethernet-driven AI infrastructure, enabling faster data transmission, congestion management, and enhanced scalability across massive GPU clusters. As AI workloads continue to grow exponentially, AMD’s Ethernet solutions provide a clear advantage over competitors, allowing for seamless scaling beyond the limitations of traditional architectures like InfiniBand.
“The foundational architecture of InfiniBand is not poised to scale beyond 48,000 nodes without making dramatic and highly complex workarounds, whereas Ethernet has really proven itself to scale to millions of nodes, delivering huge scalability advantages,” shared Jiandani, ahead of Advancing AI 2024.
AMD believes its Ethernet-based approach to AI networking helps customers scale AI clusters, reduce network congestion, and ensure seamless communication between GPUs.
Pollara 400 and Salina DPU: These products form the core of AMD’s AI networking strategy, using Ethernet to handle increasing data traffic without the complexity and cost associated with alternative networking solutions.
AMD’s Ethernet-based approach stands in contrast to competitors like NVIDIA, which relies on solutions like Spectrum-X that incorporate both NICs and switches. By focusing on Ethernet, AMD looks to deliver cost-effective, scalable AI systems without needing additional proprietary hardware.
Darrick Horton, CEO of TensorWave, said the company is building large AI clusters with AMD’s Pollara AI networking adapters to overcome networking congestion challenges. “With traditional Ethernet, we face challenges with congestion and flow management… Pollara Ethernet will allow us to build larger clusters and improve workload efficiency, specifically around job completion times and overall hybrid utilization,” he added.
Going further, AMD is also leading the Ultra Ethernet Consortium (UEC), a coalition of 97 industry-leading vendors that aims to standardise Ethernet protocols for AI networks, creating a robust ecosystem for high-performance AI and cloud computing.
The Future of AI Systems with AMD’s P4 Engine
At Advancing AI 2024, AMD also spoke about revolutionising AI systems with its fully programmable P4 engine, designed to accelerate AI networking and offload critical tasks from GPUs and CPUs.
This engine sits at the core of AMD’s Pollara 400 and Salina DPU, providing AI systems with the flexibility and performance required to manage the vast amounts of data generated by modern AI workloads.
“At the core of this solution powering innovation for both AMD and our customers is our fully programmable P4 engine, which enables us to adapt to the evolution of Ethernet and accommodate AI networking needs,” said Jiandani.
AMD’s P4 engine supports 400-gigabit line-rate throughput and can scale AI networks to handle millions of GPUs, delivering unmatched scalability.
“Our fully programmable P4 engine puts AMD in a very unique position because we now have the ability to deliver a holistic portfolio that is future-proof and provides end-to-end network solutions through full programmability,” added Jiandani. In her view, the P4 engine is poised to lead the next generation of AI systems, giving AI workloads the flexibility to evolve alongside emerging industry standards while outpacing competitors in both performance and adaptability.
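P4 itself is an open, domain-specific language for programming packet pipelines around match-action tables: the data plane matches on header fields and executes small actions, while a control plane installs table entries at runtime. The Python sketch below mimics that abstraction only to show why such a pipeline can be retargeted at new protocols without new silicon; it is not P4 code, and the action names are illustrative (UDP port 4791 is the standard RoCEv2 port).

```python
# A minimal sketch of the match-action abstraction behind P4-programmable
# hardware. Real P4 runs in the device's packet pipeline, not in Python.
from typing import Callable

Headers = dict[str, str]
Action = Callable[[Headers], None]


class MatchActionTable:
    """Match on one header field; run the bound action, or the default."""

    def __init__(self, key_field: str, default: Action):
        self.key_field = key_field
        self.default = default
        self.entries: dict[str, Action] = {}

    def add_entry(self, key: str, action: Action) -> None:
        # Entries are installed by the control plane at runtime, which is
        # what lets the pipeline adapt to new protocols without new silicon.
        self.entries[key] = action

    def apply(self, hdrs: Headers) -> None:
        self.entries.get(hdrs[self.key_field], self.default)(hdrs)


def to_priority_queue(hdrs: Headers) -> None:
    hdrs["queue"] = "ai_high_priority"


def to_best_effort(hdrs: Headers) -> None:
    hdrs["queue"] = "best_effort"


# Steer RDMA-over-Ethernet traffic (RoCEv2 uses UDP port 4791) into a
# dedicated queue; everything else stays best-effort.
table = MatchActionTable("udp_dst_port", default=to_best_effort)
table.add_entry("4791", to_priority_queue)

pkt = {"udp_dst_port": "4791"}
table.apply(pkt)
print(pkt["queue"])  # -> ai_high_priority
```

Because the tables and actions are software-defined, the same hardware pipeline can pick up a future UEC transport or congestion-signalling format by loading a new program, which is the "future-proof" property Jiandani is pointing to.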