At AWS re:Invent 2024, Amazon Web Services (AWS) announced the general availability of Amazon Elastic Compute Cloud (EC2) instances powered by its AWS Trainium2 chips. The new instances offer 30-40% better price performance than the previous generation of GPU-based EC2 instances. “Today, I’m excited to announce the GA of Trainium2-powered Amazon EC2 Trn2 instances,” said AWS chief Matt Garman.
The company also introduced Trn2 UltraServers and unveiled its next-generation Trainium3 AI chip.
The Trn2 instances are built with 16 Trainium2 chips, delivering up to 20.8 petaflops of compute performance. They are intended for training and deploying large language models (LLMs) with billions of parameters.
Trn2 UltraServers combine four Trn2 servers into a single system with 64 interconnected Trainium2 chips, offering up to 83.2 petaflops of compute for greater scalability.
“The launch of Trainium2 instances and Trn2 UltraServers provides customers with the computational power needed to tackle the most complex AI models, whether for training or inference,” said David Brown, AWS vice president of compute and networking.
AWS is working with Anthropic to create Project Rainier, a large-scale AI compute cluster powered by hundreds of thousands of Trainium2 chips. This infrastructure will support Anthropic’s model development, including the optimisation of its flagship product, Claude, to run on Trainium2 hardware.
Databricks and Hugging Face have partnered with AWS to leverage Trainium2’s capabilities for improved performance and cost efficiency in their AI offerings. Databricks plans to utilise the hardware to enhance the Mosaic AI platform, while Hugging Face integrates Trainium2 into its AI development and deployment tools.
Other customers of Trainium2 include Adobe, Poolside, and Qualcomm. “Adobe is seeing very promising early testing after running Trainium2 against their Firefly inference model, and they expect to save significant amounts of money,” said Garman.
“Poolside expects to save 40% compared to alternative options,” he added. “Qualcomm is using Trainium2 to deliver AI systems that can train in the cloud and then deploy at the edge.”
AWS also previewed its Trainium3 chip, built using a 3-nanometer process node. Trainium3-powered UltraServers are expected in late 2025 and aim to deliver four times the performance of Trn2 UltraServers.
To help customers get the most out of Trainium hardware, AWS also highlighted its Neuron SDK, a suite of software tools that enables developers to optimise their models for peak performance on Trainium chips. The SDK supports frameworks such as JAX and PyTorch, allowing customers to integrate it into their existing workflows with minimal code changes.
The Neuron SDK also supports over 100,000 models hosted on the Hugging Face model hub, further enhancing its accessibility for AI developers.
Trn2 instances are currently available in the US East (Ohio) region, with expansion to additional regions planned. Trn2 UltraServers are available in preview.
The post AWS Unveils Trainium2, Slashes AI Cost by 40% appeared first on Analytics India Magazine.