IBM Unveils Granite 3.0 Models That Outperform Llama 3.1

IBM has launched Granite 3.0, the latest generation of its large language models (LLMs) for enterprise applications. The Granite 3.0 collection includes several models, highlighted by the Granite 3.0 8B Instruct, which has been trained on over 12 trillion tokens across multiple languages.

The Granite 3.0 8B Instruct model is intended for enterprise use, demonstrating competitive performance against similar models while excelling in specific business tasks. IBM claims that on academic benchmarks included in Hugging Face’s OpenLLM Leaderboard v2, Granite 3.0 8B Instruct rivals similarly sized models from Meta and Mistral AI.

Fine-tuning options through InstructLab allow organisations to customise models to their needs, potentially reducing costs. All Granite models are released under the Apache 2.0 license, with detailed disclosures of training datasets and methodologies included in the accompanying technical paper.

The Granite 3.0 release includes:

  • General Purpose LLMs: Granite-3.0-8B-Instruct, Granite-3.0-8B-Base, Granite-3.0-2B-Instruct, and Granite-3.0-2B-Base.
  • Guardrail Models: Granite-Guardian-3.0-8B and Granite-Guardian-3.0-2B for monitoring input and output risks.
  • Mixture of Experts (MoE) Models: Granite-3.0-3B-A800M-Instruct and Granite-3.0-1B-A400M-Instruct for efficient inference.
  • Speculative Decoder: Granite-3.0-8B-Instruct-Accelerator for faster token generation.

Developers can utilise the Granite 3.0 8B Instruct model for a range of natural language applications, including text generation, classification, summarisation, entity extraction, and customer service chatbots. The model also supports programming tasks such as code generation, code explanation, and code editing, as well as agentic use cases that require tool calling.
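The article does not include code, but since the models are published on Hugging Face, a typical way to try the 8B Instruct variant is through the `transformers` library. The sketch below is a minimal illustration under that assumption; the Hub model ID `ibm-granite/granite-3.0-8b-instruct` and the standard chat-template workflow are assumptions, not details from the article.

```python
# Hedged sketch: querying Granite 3.0 8B Instruct via Hugging Face transformers.
# Assumes the `transformers` and `torch` packages are installed and that the
# model is published under the Hub ID below (an assumption, not confirmed here).
MODEL_ID = "ibm-granite/granite-3.0-8b-instruct"  # assumed Hub identifier


def generate_reply(prompt: str, max_new_tokens: int = 200) -> str:
    """Answer a single user prompt with the instruct model."""
    # Imported lazily so the sketch can be read/imported without the package.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Instruct models expect chat-formatted input; the tokenizer's chat
    # template handles the model-specific prompt formatting.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `generate_reply("Summarise this support ticket: ...")` would download the weights on first use, so in practice the model and tokenizer would be loaded once and reused across requests.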

Upcoming updates planned for 2024 will increase model context windows to 128K tokens and introduce multimodal capabilities. The Granite 3.0 models are available for commercial use on the IBM watsonx platform and through partners like Google Cloud, Hugging Face, and NVIDIA.

IBM emphasises safety and transparency in AI: the Granite 3.0 models incorporate safety features and extensive training data filtering to mitigate risks, while the Granite Guardian models screen both prompts and responses across a range of risk dimensions and, according to IBM, outperform existing models on key safety benchmarks.

IBM’s new models leverage innovative training techniques, including the Data Prep Kit for efficient data processing and a power scheduler for optimised learning rates. IBM says this enables faster convergence to optimal model weights while minimising training costs.

The Granite 3.0 language models were trained on Blue Vela, an IBM supercomputing cluster powered entirely by renewable energy, reinforcing the company’s commitment to sustainability in AI development.

The post IBM Unveils Granite 3.0 Models, Outperforms Llama 3.1 appeared first on AIM.
