Last month, Microsoft released Orca, a 13-billion-parameter model touted as an open-source, smaller alternative to GPT-4 that learns to imitate the reasoning processes of large language models. The model learns from rich signals from GPT-4, including explanation traces and step-by-step thought processes, along with other complex instructions, guided by teacher assistance from ChatGPT.
At the time, the team had only released a preview of the model, citing LLaMA’s release-policy restrictions. With LLaMA 2 now available under a commercial licence, released in partnership with Meta, we can expect the release of powerful, smaller models soon.
LLaMA 2 models range from 7 to 70 billion parameters. According to Meta, they have shown superior performance compared to open-source chat models such as LLaMA, Alpaca, and Vicuna on most benchmarks tested.
Microsoft is offering LLaMA 2 on the Azure AI model catalogue, allowing people to access it alongside cloud tools such as content filtering. The model can also run on Windows PCs.
Inside Orca
Developed by Microsoft, Orca is a 13-billion-parameter model that outperforms conventional open-source models like LLaMA, Alpaca, and Vicuna.
The authors of the paper highlight that previous models lacked rigorous evaluation, leading to an overestimation of their capabilities. In contrast, Orca was specifically designed to imitate the reasoning process of larger models through progressive learning.

To achieve this, Orca was trained to imitate the step-by-step thought processes of GPT-4, a much larger model, with intermediate teacher assistance from ChatGPT (GPT-3.5). These explanation traces allowed Orca to learn more efficiently and produce richer explanations. The authors also used system messages and complex tasks from the FLAN collection to enhance the model’s performance.
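The paper does not prescribe a single storage format, but an explanation-tuning training record conceptually bundles a system message, a user query, and the teacher’s step-by-step answer. A minimal sketch of what such a record could look like (the field names and system message below are assumptions for illustration, not Microsoft Research’s actual schema):

```python
# Illustrative shape of an explanation-tuning example in the Orca style.
# Field names and the system message are assumptions, not the exact
# schema used by Microsoft Research.

example = {
    # System instruction that elicits step-by-step reasoning from the teacher.
    "system_message": "You are a helpful assistant. Explain your reasoning step by step.",
    # A task drawn from an instruction collection such as FLAN.
    "user_query": "If a train travels 120 km in 2 hours, what is its average speed?",
    # Teacher (GPT-4 / ChatGPT) response containing the explanation trace
    # that the student model is trained to imitate.
    "teacher_response": (
        "Step 1: Average speed = distance / time.\n"
        "Step 2: 120 km / 2 hours = 60 km/h.\n"
        "Answer: 60 km/h."
    ),
}
```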
Results from various benchmarks demonstrate Orca’s impressive capabilities. In complex zero-shot reasoning benchmarks, such as Big Bench Hard, Orca surpasses Vicuna by more than 100% and outperforms GPT-4 on specific reasoning tasks. In open-ended generation, Orca achieves 95% of ChatGPT’s quality and 85% of GPT-4’s quality.
The model also shows promise in academic and professional examinations like the SAT, LSAT, GRE, and GMAT. In the Big Bench Hard benchmark, which includes 23 of the hardest tasks for language models, Orca significantly outperforms previous open-source models and even matches ChatGPT’s performance.
The research highlights the importance of leveraging system instructions and step-by-step explanations for better model performance. Orca’s ability to learn from detailed responses and reasoning processes of GPT-4 and ChatGPT has proven crucial in its success.
Orca is not alone
It looks like Orca has a new competitor. Recently, Alignment Lab AI unveiled OpenOrca-Preview1-13B, a smaller model that, much like Microsoft’s Orca, mimics the behaviour of large language models like GPT-4.
In an attempt to reproduce the dataset generated for Microsoft Research’s Orca, the team has used the OpenOrca dataset to fine-tune LLaMA-13B. “We have trained on less than 6% of our data, just to give a preview of what is possible while we further refine our dataset!” shared the Alignment Lab AI team, adding that they trained on a refined selection of 200K GPT-4 entries from OpenOrca.
The team has further filtered the GPT-4 augmentations to remove statements like “As an AI language model…” and other responses that have been shown to harm model reasoning capabilities, and said that its dataset curation practices will be detailed with the full model releases.
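Alignment Lab AI has not published its exact filtering code, but the idea is straightforward: drop any training example whose response contains refusal or boilerplate phrasing. A minimal sketch of this kind of dataset cleaning (the phrase list and record fields are illustrative assumptions, not the team’s actual pipeline):

```python
# Minimal sketch of boilerplate filtering for an instruction dataset.
# The phrase list and the {"instruction": ..., "response": ...} record
# schema are illustrative assumptions.

BOILERPLATE_PHRASES = [
    "as an ai language model",
    "i'm sorry, but",
    "i cannot provide",
]

def is_clean(record: dict) -> bool:
    """Keep a record only if its response contains no boilerplate phrase."""
    response = record.get("response", "").lower()
    return not any(phrase in response for phrase in BOILERPLATE_PHRASES)

def filter_dataset(records: list[dict]) -> list[dict]:
    """Return only the records whose responses pass the phrase check."""
    return [r for r in records if is_clean(r)]

if __name__ == "__main__":
    sample = [
        {"instruction": "Explain gravity.", "response": "Gravity is the force..."},
        {"instruction": "Who are you?", "response": "As an AI language model, I..."},
    ]
    print(len(filter_dataset(sample)))  # -> 1 (the refusal-style record is dropped)
```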
The team said the preview release shows that even a small portion of its training data can produce SOTA results in this model class, with training costs of less than $200.
Why is Microsoft Betting on Smaller LLMs?
With its recent partnership with Meta, Microsoft is looking to be at both the start and the finish line of the generative AI race. Its backing of open-source research projects like Orca, alongside its extended partnership with OpenAI, gives it the best of both worlds.
Meta, along with Microsoft, has also partnered with Qualcomm, eyeing an entire ecosystem that makes LLaMA 2 AI implementations available on phones and PCs starting next year. Smaller models like Orca can take full advantage of this.
“This will allow customers, partners and developers to build use cases, such as intelligent virtual assistants, productivity applications, content creation tools, entertainment and more,” Qualcomm said on Tuesday. “These new on-device AI experiences, powered by Snapdragon, can work in areas with no connectivity or even in aeroplane mode.”