IBM Unveils Granite 3.0 Models That Outperform Llama 3.1

IBM has launched Granite 3.0, the latest generation of its large language models (LLMs) for enterprise applications. The Granite 3.0 collection includes several models, highlighted by the Granite 3.0 8B Instruct, which has been trained on over 12 trillion tokens across multiple languages.

The Granite 3.0 8B Instruct model is intended for enterprise use, demonstrating competitive performance against similar models while excelling in specific business tasks. IBM claims that on academic benchmarks included in Hugging Face’s OpenLLM Leaderboard v2, Granite 3.0 8B Instruct rivals similarly sized models from Meta and Mistral AI.

Fine-tuning options through InstructLab allow organisations to customise models to their needs, potentially reducing costs. All Granite models are released under the Apache 2.0 license, with detailed disclosures of training datasets and methodologies included in the accompanying technical paper.

The Granite 3.0 release includes:

  • General Purpose LLMs: Granite-3.0-8B-Instruct, Granite-3.0-8B-Base, Granite-3.0-2B-Instruct, and Granite-3.0-2B-Base.
  • Guardrail Models: Granite-Guardian-3.0-8B and Granite-Guardian-3.0-2B for monitoring input and output risks.
  • Mixture of Experts (MoE) Models: Granite-3.0-3B-A800M-Instruct and Granite-3.0-1B-A400M-Instruct for efficient inference.
  • Speculative Decoder: Granite-3.0-8B-Instruct-Accelerator for faster token generation.

Developers can utilise the Granite 3.0 8B Instruct model for a range of natural language applications, including text generation, classification, summarisation, entity extraction, and customer service chatbots. The model also supports programming tasks such as code generation, code explanation, and code editing, as well as agentic use cases that require tool calling.
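The article does not include code, but since the models are published on Hugging Face, a typical way to try the 8B Instruct variant is through the `transformers` library. The sketch below is a minimal illustration under that assumption; the Hub model ID `ibm-granite/granite-3.0-8b-instruct` and the standard chat-template workflow are assumptions, not details from the article.

```python
# Hedged sketch: querying Granite 3.0 8B Instruct via Hugging Face transformers.
# Assumes the `transformers` and `torch` packages are installed and that the
# model is published under the Hub ID below (an assumption, not confirmed here).
MODEL_ID = "ibm-granite/granite-3.0-8b-instruct"  # assumed Hub identifier


def generate_reply(prompt: str, max_new_tokens: int = 200) -> str:
    """Answer a single user prompt with the instruct model."""
    # Imported lazily so the sketch can be read/imported without the package.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Instruct models expect chat-formatted input; the tokenizer's chat
    # template handles the model-specific prompt formatting.
    messages = [{"role": "user", "content": prompt}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    outputs = model.generate(inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, skipping the echoed prompt.
    return tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
```

Calling `generate_reply("Summarise this support ticket: ...")` would download the weights on first use, so in practice the model and tokenizer would be loaded once and reused across requests.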

Upcoming updates planned for 2024 will increase model context windows to 128K tokens and introduce multimodal capabilities. The Granite 3.0 models are available for commercial use on the IBM watsonx platform and through partners like Google Cloud, Hugging Face, and NVIDIA.

IBM emphasises safety and transparency in AI: the Granite 3.0 models incorporate safety features and extensive training data filtering to mitigate risks, while the Granite Guardian models screen both prompts and responses across a range of risk dimensions and, according to IBM, outperform existing models on key safety benchmarks.

IBM’s new models leverage innovative training techniques, including the Data Prep Kit for efficient data processing and a power scheduler for optimised learning rates. IBM says this enables faster convergence to optimal model weights while minimising training costs.

The Granite 3.0 language models were trained on Blue Vela, an IBM supercomputing cluster powered entirely by renewable energy, reinforcing the company’s commitment to sustainability in AI development.

The post IBM Unveils Granite 3.0 Models, Outperforms Llama 3.1 appeared first on AIM.
