The open-source AI chatbot OpenAssistant has been released recently with the slogan “Conversational AI for Everyone”. It is the first open-source, instruction-based model created from real human data. OpenAssistant is an open alternative to ChatGPT, the well-known conversational AI model. This project was orchestrated by the Large-Scale Artificial Intelligence Network (LAION) e. V., the same association which has developed another open AI system called Stable Diffusion, released in August 2022.
OpenAssistant was developed by Andreas Köpf and popular technology influencer Yannic Kilcher. Through the cooperation with LAION, they have created a valuable dataset to train AI models, incorporating more than 600,000 inputs and feedbacks. With this dataset and the models they have created, it becomes possible to develop even more contemporary applications with AI techniques. The dataset and the accompanying models can be accessed by developers for free, and they are available on Hugging Face under the Apache 2.0 license, with the exception of the models based on LLaMA, which are yet to be released due to licensing issues.
LAION is a non-profit organization founded in 2019, with the goal to provide the public with open-source AI tools. It is dedicated to democratizing the development of AI models and increasing the transparency of AI models and the organizations behind them. To date, LAION has succeeded in creating high quality datasets for image synthesis, and now, for conversational domain with OpenAssistant. They also provide tools for training and understanding AI models in a way that is easily accessible to everyone.
Andreas Köpf and Yannic Kilcher are the two driving forces behind OpenAssistant. Köpf is an AI entrepreneur and a founding member of the LAION e.V. association. Yannic Kilcher is an influential tech vlogger and a digital innovator with a focus on Artificial Intelligence and Machine Learning. Together, they have teamed up to collect data and feedback over the past months in order to create a high-quality dataset for training AI models.