In 2024, the AI community witnessed the launch of several new large language models (LLMs), such as OpenAI’s o3 and Google’s Gemini 2, which promised to push the boundaries of what’s possible with AI. However, not all of them lived up to the hype. Despite their impressive features, some models struggled to gain traction, with disappointing adoption rates and limited interest.
Here are some LLMs that failed to make a lasting impact in 2024, even after bold announcements and ambitious claims.
1. Databricks DBRX
Databricks launched DBRX, an open-source LLM with 132 billion parameters, in March 2024. It uses a fine-grained MoE architecture that activates four of 16 experts per input, with 36 billion active parameters. The company claimed that the model outperformed closed-source counterparts like GPT-3.5 and Gemini 1.5 Pro.
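To make that routing scheme concrete, here is a minimal, simplified sketch of top-4-of-16 expert selection in PyTorch. It only illustrates the general fine-grained MoE mechanism described above; the layer sizes are placeholders, and this is not Databricks’ actual DBRX implementation.

```python
# Simplified sketch of fine-grained MoE routing (4 of 16 experts per token).
# Illustrates the general mechanism only -- not DBRX's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=512, d_ff=2048, n_experts=16, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # scores each token against every expert
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        ])

    def forward(self, x):                                   # x: (tokens, d_model)
        weights = F.softmax(self.router(x), dim=-1)
        top_w, top_idx = weights.topk(self.k, dim=-1)       # keep only the k best experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)     # renormalise their weights
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e                # tokens routed to expert e in this slot
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

tokens = torch.randn(8, 512)
print(TopKMoE()(tokens).shape)   # torch.Size([8, 512])
```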
However, since its launch, there has been little discussion about its adoption or whether enterprises find it suitable for building applications. The Mosaic team, acquired by Databricks in 2023 for $1.3 billion, led its development, and the company spent $10 million to build DBRX. But sadly, the model saw an abysmal 23 downloads on Hugging Face last month.
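The download figures cited in this piece come from Hugging Face’s model pages, which report a rolling count covering roughly the last 30 days. For reference, such counts can also be queried programmatically with the huggingface_hub library; the repo id below is an assumption based on the model’s name.

```python
# Hedged sketch: querying a model's rolling ~30-day download count from the Hugging Face Hub.
from huggingface_hub import HfApi

api = HfApi()
info = api.model_info("databricks/dbrx-instruct")  # assumed repo id
print(info.id, info.downloads)                     # downloads over roughly the last 30 days
```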
2. Falcon 2
In May, the Technology Innovation Institute (TII), Abu Dhabi, released its next series of Falcon language models in two variants: Falcon-2-11B and Falcon-2-11B-VLM. The Falcon 2 models showed impressive benchmark performance, with Falcon-2-11B outperforming Meta’s Llama 3 8B and matching Google’s Gemma 7B, as independently verified by the Hugging Face leaderboard.
However, later in the year, Meta released Llama 3.2 and Llama 3.3, leaving Falcon 2 behind. According to Hugging Face, Falcon-2-11B-VLM recorded just around 1,000 downloads last month.
3. Snowflake Arctic
In April, Snowflake launched Arctic LLM, a 480B-parameter model with a dense-MoE hybrid Transformer architecture using 128 experts. The company proudly stated that it spent just $2 million to train the model, which it said outperformed DBRX in tasks like SQL generation.
The repeated comparisons with DBRX suggested an effort to challenge Databricks. At the same time, Snowflake acknowledged that models like Llama 3 outperformed Arctic on some benchmarks.
4. Stable LM 2
Stability AI launched the Stable LM 2 series in January 2024, featuring two variants: Stable LM 2 1.6B and Stable LM 2 12B. The 1.6B model, trained on 2 trillion tokens, supports seven languages, including Spanish, German, Italian, French, and Portuguese, and outperforms models like Microsoft’s Phi-1.5 and TinyLlama 1.1B in most tasks.
Stable LM 2 12B, launched in May, offers 12 billion parameters and is trained on 2 trillion tokens across seven languages. The company claimed that the model competes with larger ones like Mixtral, Llama 2, and Qwen 1.5, excelling in tool usage for RAG systems. However, the latest download statistics tell a different story, with just 444 downloads last month.
5. Nemotron-4 340B
Nemotron-4-340B-Instruct is an LLM developed by NVIDIA for synthetic data generation and chat applications. Released in June 2024, it is part of the Nemotron-4 340B series, which also includes the Base and Reward variants. Despite its features, the model has seen minimal uptake, recording just 101 downloads on Hugging Face in December 2024.
6. Jamba
AI21 Labs introduced Jamba in March 2024, an LLM that combines Mamba-based structured state space models (SSM) with traditional Transformer layers. The Jamba family includes multiple versions, such as Jamba-v0.1, Jamba 1.5 Mini, and Jamba 1.5 Large.
With its 256K token context window, Jamba can process much larger chunks of text than many competing models, sparking initial excitement. However, the model failed to capture much attention, garnering only around 7K downloads on Hugging Face last month.
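For those curious to try it, below is a hedged loading sketch using Hugging Face transformers. The repo id, a recent transformers version with Jamba support, and the bfloat16/multi-GPU settings are assumptions rather than details published by AI21; the weights are large, so substantial GPU memory is needed, especially for long prompts.

```python
# Hedged sketch: loading a Jamba checkpoint with Hugging Face transformers.
# Assumes a transformers version with Jamba support and the accelerate package for sharding.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ai21labs/Jamba-v0.1"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,   # halve memory relative to fp32
    device_map="auto",            # shard layers across available GPUs
)

inputs = tokenizer("Summarise the following report:\n...", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```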
7. AMD OLMo
AMD entered the open-source AI arena in late 2024 with its OLMo series of Transformer-based, decoder-only language models. The OLMo series includes the base OLMo 1B, OLMo 1B SFT (Supervised Fine-Tuned), and OLMo 1B SFT DPO (aligned with human preferences via Direct Preference Optimisation).
The models were trained on 16 nodes powered by AMD Instinct MI250 GPUs, achieving a throughput of 12,200 tokens/sec/GPU.
The flagship OLMo 1B model features 1.2 billion parameters, 16 layers, 16 heads, a hidden size of 2048, a context length of 2048 tokens, and a vocabulary size of 50,280, and targets developers, data scientists, and businesses. Despite this, the model failed to gain any traction in the community.
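As a rough sanity check on the 1.2-billion-parameter figure, those dimensions can be plugged into a back-of-the-envelope count. The SwiGLU feed-forward width of 8192 and the tied embeddings assumed below are not published figures, so the result is only approximate.

```python
# Back-of-the-envelope parameter count for a 16-layer decoder-only model with
# the dimensions quoted above. The SwiGLU feed-forward width (8192) and tied
# input/output embeddings are assumptions, not published figures.
d_model, n_layers, vocab, d_ff = 2048, 16, 50_280, 8192

embeddings = vocab * d_model                      # ~103M (tied input/output embeddings)
attention_per_layer = 4 * d_model * d_model       # Q, K, V and output projections
mlp_per_layer = 3 * d_model * d_ff                # SwiGLU: gate, up and down projections
total = embeddings + n_layers * (attention_per_layer + mlp_per_layer)

print(f"{total / 1e9:.2f}B parameters")           # ~1.18B, close to the quoted 1.2B
```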