Building Generative AI Models: Insights from MosaicML and VB Transform 2023
Enterprises are still navigating the emerging field of large language models (LLMs) and generative AI systems. Options such as using OpenAI’s hosted models or fine-tuning existing models are available, but building customized models from scratch is also becoming a popular choice for forward-thinking companies. However, the concept of mixing and matching models is not yet well understood by many.
According to Naveen Rao, co-founder and CEO of MosaicML, this lack of understanding is expected, given how new these technologies are to the mainstream. In a fireside chat at VB Transform 2023 with Matt Marshall, founder of VentureBeat, Rao highlighted how rapidly large language models and generative pre-trained transformers (GPT) have been adopted over the past nine months.
MosaicML, a company that helps enterprises train and deploy LLMs and other generative AI models, recently made headlines for its acquisition by Databricks. The deal, valued at $1.3 billion, underscored the potential and value of MosaicML’s technology. In May, the startup released its MPT-7B model, which it trained at a cost of about $200,000.
Rao emphasized that enterprise models do not need to be able to discuss far-ranging topics such as the fall of Rome. Instead, organizations should focus on ensuring that models are capable and correct for their specific use cases. He added that OpenAI has not necessarily built its models with those specific needs in mind.
Many organizations are still in the data-gathering phase, and the next step is figuring out how to activate that data with AI. Rao advised enterprises to pre-train on their own data or incorporate it into existing models, since building models for every domain is too difficult a task for any one provider. To achieve this, organizations need to empower domain experts to build models in their respective fields.
MosaicML has witnessed early adopters successfully putting models into production and gathering user feedback. This iterative process allows for continuous innovation and improvement within the field of generative AI.
Since its founding in 2021, MosaicML has focused on simplifying the training of large models by creating a stable, cross-cloud interface. The company has reached 50 customers with an investment of only $35 million. Rao explained that MosaicML is selective in choosing its customers, ensuring they have strong teams and well-structured data.
Rao noted that MosaicML was already familiar with the potential of models like ChatGPT before they gained widespread popularity. He acknowledged the entertainment aspect of chatbots and admitted that he initially doubted they would have a significant impact, a view he reconsidered once his teenage children started discussing them.
Looking ahead, Rao believes it will take a few more years for traditional enterprises to fully embrace the use of generative AI models. However, he sees fintech as an early adopter, with healthcare also starting to leverage these technologies. The most common use cases will involve enhancing consumer experiences, providing personalized and context-driven search results, and supporting automation in various industries.
Rao emphasized that the pace of change in the AI field is currently very high, and the integration of generative AI will enhance, rather than replace, jobs. Co-pilots for lawyers, doctors, and other professions will become a reality, offering invaluable support.
Regarding the Databricks acquisition, Rao stated that while he was not actively seeking a buyer, the synergy between MosaicML and Databricks was strong. MosaicML’s technology complements the existing enterprise software of Databricks, which serves over 10,000 customers.
Rao concluded by expressing MosaicML’s hunger to be at the forefront of the industry and its commitment to winning. The company aims to provide innovative solutions and be a leader in the rapidly evolving field of generative AI.