Meta, the social media giant formerly known as Facebook, is ramping up its artificial intelligence (AI) efforts to catch up with industry leaders like Google, Microsoft, and OpenAI. The company recently unveiled a new text-to-image model called CM3leon, which it claims achieves state-of-the-art performance in generating images from text prompts.
CM3leon represents a significant breakthrough for Meta’s AI capabilities. Not only can it generate high-quality images from text descriptions, but it can also provide coherent captions for existing images. This lays the foundation for more advanced image understanding models in the future.
To advance its models, Meta is leveraging its strong data science team and computing infrastructure. While diffusion-based AI models have been making headlines, Meta is placing its bets on autoregressive transformer architectures, similar to the technology behind ChatGPT. According to Meta, CM3leon requires only a fraction of the training compute used by comparable methods, making it more efficient to train.
In head-to-head comparisons, CM3leon appears to outperform models like OpenAI’s DALL-E 2 and Midjourney in handling complex objects and constraints in text prompts. Images shared by Meta show its text-to-image generator representing human anatomy accurately, avoiding the infamous spaghetti hands commonly seen in AI-generated images. CM3leon can also render text accurately, addressing the problem of random words appearing in AI images.
In addition to text-to-image generation, CM3leon offers advanced features such as image editing, segmentation-to-image conversion, and super-resolution upscaling. Few other generators offer this combination of features; Stable Diffusion used with ControlNet is one exception.
Furthermore, Meta is reportedly planning to release a commercial version of its LLaMA natural language model to outside developers. This move would enable startups and enterprises to build custom applications powered by Meta’s AI, putting the company in direct competition with OpenAI’s ChatGPT, Google’s Bard, and Anthropic’s Claude 2.
Despite previously emphasizing its focus on metaverse projects, Meta now appears to be strongly pivoting towards AI across all its applications. The company has established a dedicated generative AI unit led by Chief Product Officer Chris Cox. Additionally, Meta is developing AI tools to improve targeted advertising.
In a departure from the closed-off approach of competitors like OpenAI, Meta has taken a more open approach with key models such as its LLaMA large language model, which it released to researchers and whose weights subsequently leaked online. This openness aims to encourage developers worldwide to innovate and improve the technology. However, the possibility of monetizing Meta’s models in the future remains open.
This surge of AI activity comes at a time when Meta faces challenges such as declining stock value and privacy and misinformation controversies associated with Facebook, its largest platform. Meta CEO Mark Zuckerberg believes that heavy investment in generative AI aligns with the company’s metaverse vision and could lead to new revenue streams.
In addition to its AI endeavors, Meta has also introduced Threads, a Twitter-like platform whose early user growth has outpaced that of OpenAI’s ChatGPT after its launch. Meta has demonstrated its ability to take existing technologies, enhance them, and create successful products that outshine its competitors.
With promising performance from new models like CM3leon, Meta is determined to aggressively pursue AI and shape its future trajectory after its metaverse ventures failed to impress investors. The race to dominate generative AI has gained an exciting new participant.