The Allen Institute for AI (AI2) has unveiled OLMo 7B and OLMo 1B, setting a new standard by releasing not only the model but also its pre-training data and training code. This comprehensive release distinguishes OLMo as a truly open model, providing researchers and developers unprecedented access to advance the collective science of language models.
OLMo and its accompanying framework are designed to help researchers train and experiment with large language models, and both are released under an Apache 2.0 licence.
The models and framework are available for direct download on Hugging Face and GitHub.
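For readers who want to try the model straight from the Hub, here is a minimal sketch using the transformers library. The repository id "allenai/OLMo-7B" and the need for trust_remote_code are assumptions about how the checkpoint is published, not details confirmed in this article.

```python
# Minimal sketch: loading OLMo from the Hugging Face Hub with transformers.
# The repo id "allenai/OLMo-7B" and trust_remote_code flag are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"  # assumed Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Generate a short continuation to confirm the weights load and run.
prompt = "Language models are"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```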
Alongside the release, Nathan Lambert of AI2 posted a torrent magnet link for the 1B model on X, reminiscent of how Mistral releases its open-source models.
> magnet:?xt=urn:btih:7b212968cbf47b8ebcd7017a1e41ac20bf335311&xt=urn:btmh:122043d0d1a79eb31508aacdfe2e237b702f280e6b2a1c121b39763bfecd7268a62d&dn=ai2-model
>
> release 49c8647f439c324f564651c83bd945c0140c2750
>
> err not sure you should get models like this but enjoy
>
> — Nathan Lambert (@natolambert) February 1, 2024
This initiative is the result of a collaborative effort involving the Kempner Institute for the Study of Natural and Artificial Intelligence at Harvard University, along with key partners including AMD, CSC (LUMI supercomputer), the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and Databricks.
The OLMo framework introduces a suite of fully open AI development tools, including:
- Full Pretraining Data: OLMo is pretrained on AI2’s Dolma set, a three-trillion-token open corpus, released together with the code used to produce it (a loading sketch follows this list).
- Training Code and Model Weights: The framework includes full model weights for four variants at the 7B scale, each trained to at least 2 trillion tokens, along with inference code, training metrics, and training logs.
- Evaluation: The evaluation suite used during development, covering 500+ checkpoints per model, is released under the Catwalk project, including evaluation code.
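Because Dolma is far too large to download eagerly, streaming is the natural way to inspect it. The sketch below assumes the corpus is hosted on the Hugging Face Hub under the id "allenai/dolma" with a "text" field per record; both are assumptions for illustration, not details stated in this article.

```python
# Minimal sketch: streaming a few records from the Dolma pretraining corpus.
# The dataset id "allenai/dolma" and the "text" field name are assumptions.
from datasets import load_dataset

# Streaming mode avoids downloading the multi-terabyte corpus up front.
dolma = load_dataset("allenai/dolma", split="train", streaming=True)

for i, example in enumerate(dolma):
    print(example["text"][:200])  # preview the first 200 characters of each record
    if i >= 2:
        break
```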
Julien Chaumond, CTO of Hugging Face, posted on LinkedIn praising AI2’s decision to release the model with weights, evaluation metrics, and training code, highlighting its dedication to open source.
Eric Horvitz, Microsoft’s Chief Scientific Officer and a founding member of the AI2 Scientific Advisory Board, expressed his enthusiasm, stating, “The new offering continues Allen AI’s tradition of providing valuable open models, tools, and data, which have spurred numerous advancements in AI across the global community.”
Noah Smith, OLMo project lead, emphasised the significance of openness in AI research, stating, “With OLMo, open actually means ‘open,’ and everyone in the AI research community will have access to all aspects of model creation.”