On Wednesday, Databricks released Dolly 2.0, the first freely available, instruction-following large language model (LLM) licensed for commercial use and fine-tuned on a human-generated data set. It represents a spark in the language model universe, providing a foundation for building ChatGPT-style competitors.
Databricks, an American enterprise software company founded in 2013 by the creators of Apache Spark, aims to let organizations create and customize LLMs without paying for API access or sharing data with third parties.
Dolly 2.0 is a 12-billion-parameter model based on EleutherAI's Pythia model family and fine-tuned exclusively on databricks-dolly-15k, a data set crowdsourced from Databricks employees. This fine-tuning gives it capabilities closer to those of OpenAI's ChatGPT, which can answer questions properly and engage in realistic conversation.
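For readers who want to try the model themselves, the sketch below shows one way to run it with the Hugging Face transformers library. It assumes the weights are published on the Hugging Face Hub as `databricks/dolly-v2-12b` and that the custom instruction-following pipeline referenced on the model card is available via `trust_remote_code`; treat it as an illustrative sketch rather than an official recipe.

```python
# Minimal sketch: running Dolly 2.0 through Hugging Face transformers.
# Assumes the weights live at "databricks/dolly-v2-12b" and that enough
# GPU/CPU memory is available for a 12B-parameter model.
import torch
from transformers import pipeline

generate_text = pipeline(
    model="databricks/dolly-v2-12b",   # assumed Hub identifier for the 12B model
    torch_dtype=torch.bfloat16,        # roughly halves memory use versus fp32
    trust_remote_code=True,            # pulls the model card's instruction pipeline
    device_map="auto",                 # spread layers across available devices
)

# Ask an instruction-style question, the kind of prompt the fine-tuning data targets.
result = generate_text("Explain the difference between nuclear fission and fusion.")
print(result)
```

Because the weights are freely downloadable, the same snippet runs entirely on local infrastructure, with no API key or usage agreement beyond the model's license.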
In March of this year, Databricks began its journey with the release of Dolly 1.0, which was hampered because its training data included ChatGPT outputs, obliging users to adhere to OpenAI's terms of service.
The Databricks team then took on the colossal task of building a new data set to enable a commercially usable LLM: roughly 15,000 demonstrations crowdsourced from more than 5,000 employees, who were motivated through an internal contest. The data-generation tasks included open Q&A, closed Q&A, summarizing information from Wikipedia, brainstorming, classification, and creative writing.
The data set, model weights, and training code were released under a Creative Commons license that permits commercial use, modification, and extension. That sets it apart from OpenAI's ChatGPT, which requires users to pay for API access, and Meta's LLaMA, which is only partially open source and forbids commercial use.
AI researcher Simon Willison called the launch of Dolly 2.0 a "really big deal" and praised Databricks for the instruction-tuning data set, created by more than 5,000 Databricks employees and openly released under a Creative Commons license.
The potential of Dolly 2.0 is considerable: it could spark a new wave of open source language models free from proprietary restrictions and limits on commercial use. With further refinement, these fine-tuned models may even run on consumer-class machines.
Databricks LLC is a software company founded by the original creators of Apache Spark, an open-source distributed computing platform for processing large data sets. It provides a web-based platform for developing and running big-data workloads, with support for a variety of languages, libraries, APIs, and other technologies.
Simon Willison is a software developer and AI researcher who experiments with open source language models, including Dolly. Willison's comments on the release of Dolly 2.0 stoked anticipation for the potential of open source language models, summed up in his words: "Even if Dolly 2 isn't good, I expect we'll see a bunch of new projects using that training data soon. And some of those might produce something really useful."
The Dolly 2.0 weights are available on Hugging Face, and the databricks-dolly-15k data set is free to download from GitHub. It is an exciting time for large language models, with freely available, open source AI opening up a wide range of possibilities.
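As a brief, hedged example of working with the data itself, the snippet below loads the records with the Hugging Face datasets library. It assumes the data set is mirrored on the Hugging Face Hub as `databricks/databricks-dolly-15k` and that each record carries instruction, context, response, and category fields; the raw file downloaded from GitHub should expose the same fields as JSON keys.

```python
# Minimal sketch: inspecting databricks-dolly-15k with the `datasets` library.
# Assumes the data set is mirrored on the Hugging Face Hub under this name.
from collections import Counter
from datasets import load_dataset

dolly = load_dataset("databricks/databricks-dolly-15k", split="train")

print(len(dolly), "records")

# Count how many examples fall into each task category
# (open Q&A, closed Q&A, summarization, brainstorming, classification, ...).
print(Counter(example["category"] for example in dolly))

# Show one record: an instruction, an optional reference context, and a response.
sample = dolly[0]
print(sample["instruction"])
print(sample["context"])
print(sample["response"])
```

Because the license permits modification and commercial use, the same few lines are all that is needed to start filtering, extending, or remixing the data for a new fine-tuning project.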