The Dumbledore of LLMs

Making an open-source LLM as powerful as closed ones like GPT-4 and Claude 3 Opus definitely requires a lot of magic. WizardLM has been up to that task ever since it decided to build a model on top of Meta’s Llama 2. Now, it has done so again using Mistral’s Mixtral model.

Undoubtedly, it is all part of the Microsoft partnership.

Introducing WizardLM-2, an open-source SOTA language model that offers improved performance in complex chat, multilingual, reasoning, and agent tasks. The model includes three advanced versions: WizardLM-2 8x22B, WizardLM-2 70B, and WizardLM-2 7B.

The WizardLM-2 8x22B excels in intricate tasks, WizardLM-2 70B offers top-tier reasoning, and WizardLM-2 7B is the fastest while matching the performance of models 10 times its size.

The model weights for WizardLM-2 8x22B and WizardLM-2 7B were briefly available on Hugging Face before being pulled down due to a premature release. The team explained on X that it had “accidentally missed an item required in the model release process – toxicity testing.”

🫡 We are sorry for that.
It’s been a while since we’ve released a model months ago😅, so we’re unfamiliar with the new release process now: We accidentally missed an item required in the model release process – toxicity testing.
We are currently completing this test quickly…

— WizardLM (@WizardLM_AI) April 16, 2024

Too powerful to be out in the open

WizardLM-2 8x22B, the most advanced release, falls just short of GPT-4-1106-preview and reaches top-tier performance compared to other models of the same size. Moreover, even the 7B model achieves good performance when compared with larger models such as Qwen1.5 14B.

The model was also trained on synthetic data generated by AI models. As WizardLM explains, “Natural world’s human data becomes increasingly exhausted through LLM training, we believe that the data carefully created by AI and the model step-by-step supervised by AI will be the sole path towards more powerful AI.”

The Mixture of Experts and multilingual model has a total parameter size of 141 billion. It comes with an Apache 2.0 license, which is more permissive than Llama 2’s custom community license, making it highly competitive. This comes just as Meta is about to release Llama 3 this week.

When compared on MT-Bench, a benchmark that correlates closely with human preference ratings on the Chatbot Arena, the 8x22B model achieved a 9.12 rating, placing it fourth on the chart, just behind Claude 3 Sonnet’s 9.18.

On the other hand, the model tops the chart when compared against all other open-source models on the same benchmark. In simple terms, it offers the capabilities of a GPT-4-level model that can run on a single laptop.

The internet wants uncensored models

Microsoft has been a blessing for WizardLM models. As mentioned in the training details, the models were trained on data generated by GPT-4, which, apart from being better suited for training AI models, is also cheaper to access than human-generated data.

On the other hand, Microsoft’s insistence on releasing “censored” models has also been a hindrance for WizardLM.

LOLWUT! Wizard LM Is Forced To Pull Down Their Model Because The Censors Hadn’t Blessed It!
Apparently, Wizard LM had to pull its models from HuggingFace because they hadn’t completed “toxicity testing” of their models!!
So they have to go do that before releasing it again!…

— Bindu Reddy (@bindureddy) April 16, 2024

Since the initial uncensored version was pulled down by Microsoft, users on X have been asking for the release of both versions on Hugging Face, arguing that developers can decide for themselves which one to use. However, this does not comply with Microsoft’s policy.

“Considering these models are released by ‘Microsoft AI’, I doubt they do anything against the ToS of ‘OpenAI’,” said a user on Hacker News.

The battle of spells

While Llama 3 is almost here, there are other models on the horizon competing closely with the Dumbledore of LLMs. Google has Gemma, and Microsoft also has Phi-2 and Orca. Meanwhile, Amazon remains tight-lipped about building smaller models of its own, relying instead on open-source models.

Microsoft’s bet on WizardLM is clearly its experiment with open-source models while it builds larger and stronger models with OpenAI. Let’s wait and watch where the model lands on the Open LLM Leaderboard when it is launched again.

The post The Dumbledore of LLMs appeared first on Analytics India Magazine.
