Image by Author | Bing Image Creator
We are seeing rapid development of ChatGPT open-source alternatives, and some of them, like Vicuna, are producing amazing results. But there is a catch. These new models have restricted licenses. We cannot use them for commercial use.
On other hand, Open Assistant is trying to change that. Their mission is to give everyone access to a great chat-based large language model like ChatGPT and GPT-4.
In this post, we will learn about the Open Assistant project, its features, limitations, and plans. Moreover, we will provide you with all the resources to start creating your chatbot.
What is an Open Assistant?
The Open-Assistant project is revolutionizing language innovations. Instead of keeping high-quality large language models private, they are letting everyone use datasets, models, code sources, and the Open Assistant platform.
The Open-Assistant models are trained on a dataset that was collected from more than 13,000 volunteers. The collected dataset has over 600K interactions, 150K messages, and 10K fully annotated conversation trees on diverse topics in multiple languages.
Watch the launch video to understand how cool this project is.
If you go to their Hugging Face page, you will see multiple model architectures trained on the Open Assistant dataset, for example, Stable LM, LLaMA, Pythia, Galactica, and more. They are working on a state-of-the-art model on the latest data, and soon they will launch that model with security features.
Note: some of the models have restricted licenses (for research only), like LLaMA, but you will also see models like Pythia that are open for any use.
How To Try It Out
You can check out a Hugging Face demo to interact with the model or sign up for free to official chat to experience state-of-the-art models.
As we all know that the project is created by an open-source community for the community, you will see options to improve the chat and contribute to data collection.
Chatting with the AI
Open Assistant lets you chat with a chatbot and give feedback on its responses. To start, sign up and click on the chat button. Then, use the thumbs-up or down icons to react to the chatbot's messages and help it learn.
Image from Chat
Contributing to Data Collection
The data collection UI is quite simple. Just click on the Dashboard button, select the task, and start contributing. You can improve the capabilities of Open Assistant by submitting, ranking, and labeling model prompts and responses.
Image from Open Assistant
When you make a valid contribution to the dataset, your score will be shown on a public leaderboard. This is a way of gamifying the contribution process.
Image from Open Assistant Limitations
The limitations of Open Assistant are limitations of most open-source large language models. These models are trained on fewer coding and math interactions which results in failing horribly at answering math and coding questions.
The model is good at generating interesting answers and is more human-like, but sometimes it produces factually wrong or misleading answers.
You need to understand that these models are small compared to ChatGPT and there will be limitations.
Future Plan
The Open Assistant founders have a vision of creating an assistant of the future that can perform various tasks such as writing emails, doing meaningful work, using APIs, and dynamically researching information. Moreover, they want their assistant to be customizable and extensible to anyone who uses it.
- They will continue to collect more high-quality data and train better models.
- Their vision is to create a unified platform that includes conversational assistants, retrieval via search engines, integration of APIs and third-party integrations, and building blocks for developers.
- They still have a few private models that they want to make public after working on security features.
- The community is working on launching a methodology that will help train and run large language models on consumer-based GPUs.
Getting Started
The Open Assistant project is fully transparent and licensed for commercial use. Only a few models, such as LLaMa, are restricted. Everything else, including models, datasets, code, inference, paper, demo, and documentation, is free and public.
The platform lets you contribute to the dataset and climb the leaderboard. You can also train your model with the public dataset. Explore the endless possibilities.
- Official Page: Open Assistant | Open Assistant (laion.ai)
- GitHub: LAION-AI/Open-Assistant
- HuggingFace Demo: Chat Llm Streaming – a Hugging Face Space by olivierdehaene
- Official Chat: chat (open-assistant.io) (Requires signup)
- Model Weights: OpenAssistant/oasst-sft-1-pythia-12b
- Dataset: OpenAssistant/oasst1
- Documentation: Introduction | Open Assistant (laion.ai)
- Research Paper: OpenAssistant Conversations — Democratizing Large Language Model Alignment
Don’t forget to give likes, stars, and hearts to the project. They deserve our love as they are doing this selflessly.
Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in Technology Management and a bachelor's degree in Telecommunication Engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.
- Facebook Open Sources a Chatbot That Can Discuss Any Topic
- Open Data and Why it is Necessary
- The 7 Best Open Source AI Libraries You May Not Have Heard Of
- DataOps Summit 2021 CFP Is Now Open!
- Developing an Open Standard for Analytics Tracking
- Top Open Source Large Language Models