![aws-trainium2-chip-11-29-24-embargo-till-12-3-24](https://www.zdnet.com/a/img/resize/281dd96543ba4b273428dc570f881a11728f43b8/2024/12/02/1471660a-bdcc-4a60-97d2-0d6ed5f362dd/aws-trainium2-chip-11-29-24-embargo-till-12-3-24.jpg?auto=webp&width=1280)
Trainium2, first announced a year ago, is aimed at training large language models with trillions of parameters. It is now generally available on AWS's EC2 instances.
At its annual re:Invent conference in Las Vegas on Monday, Amazon's AWS cloud computing unit disclosed the third generation of its Trainium chip for training large language models (LLMs) and other forms of artificial intelligence (AI), a year after the debut of the chip's second version.
The new Trainium3 chip, which will become available next year, will be up to twice as fast as the existing Trainium2 while being 40% more energy-efficient, said AWS CEO Matt Garman during his keynote on Tuesday.
Trainium3 is the first AWS chip to use a three-nanometer semiconductor manufacturing process.
In the meantime, the Trainium2 chips unveiled a year ago are now generally available, said Garman. The chips are four times faster than the previous generation and are geared toward LLM training; Garman emphasized performance on Llama, Meta Platforms' popular open-source model.
"Independent inference performance tests for Meta's Llama 405B showed that Amazon Bedrock, running on Trn2 instances, delivers more than 3x higher token-generation throughput compared to other available offerings by major cloud providers," the company says.
Amazon also announced UltraServers, a new offering for AWS's Elastic Compute Cloud service that connects 64 of the current Trainium2 chips "into one giant server", using NeuronLink interconnections. The servers are available now on EC2.
Each rack of Trainium2 UltraServers combines 64 Trainium2 chips for generative AI workloads.
The UltraServer is designed to handle LLMs with trillions of parameters, said Amazon. To aid development for the Trainium parts, the company rolled out a software development kit, known as Neuron, that includes a compiler, runtime libraries, and tools optimized for Trainium. Neuron has native support for "popular frameworks" in AI such as JAX and PyTorch, and "over 100,000 models on the Hugging Face model hub".
Garman also gave a sneak peek at future developments. New versions of the UltraServers running Trainium3 are expected to be four times "more performant" than the Trainium2-based UltraServers, "allowing customers to iterate even faster when building models and deliver superior real-time performance when deploying them."
The company said work is underway to build "Project Rainier", which would be an "UltraCluster" grouping numerous UltraServers to allow access to "hundreds of thousands of Trainium2 chips".
The UltraCluster is being developed in partnership with generative AI startup Anthropic.
Re:Invent runs through Friday, Dec. 6, and you can register for free to watch the livestream on the event site.