Mixtral 8x7B, an open-source Mixture of Experts (MoE) large language model, has emerged as a serious challenger to OpenAI’s dominance in the AI landscape. According to its official evaluation report, Mixtral 8x7B outperforms Llama 2 70B and matches or exceeds GPT-3.5 on most benchmarks. The release has sparked intense discussion across social media platforms.
Unlike Llama 2 70B, Mixtral 8x7B is a sparse model, which is what allows it to run inference six times faster. It is a decoder-only Transformer in which each layer contains eight distinct feedforward blocks (the experts), only some of which are used for any given token, for a total of 46.7B parameters.
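To make the idea of a sparse layer concrete, here is a minimal sketch of expert routing in plain Python. It assumes top-2 routing (two of the eight experts per token), as Mistral AI describes for Mixtral. The toy hidden size, the random weights, and the helper names (make_expert, route, moe_layer) are illustrative assumptions, and each expert is reduced to a single linear map rather than a real feedforward block.

```python
import math
import random

NUM_EXPERTS = 8   # eight feedforward blocks per layer, as described above
TOP_K = 2         # assumption: number of experts consulted per token
DIM = 4           # toy hidden size, far smaller than the real model


def make_expert(rng):
    """Toy stand-in for a feedforward block: one DIM x DIM linear map."""
    weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(DIM)]

    def expert(x):
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

    return expert


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(x, router_weights):
    """Return [(expert_index, gate_weight)] for the TOP_K highest router scores."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in router_weights]
    top = sorted(range(NUM_EXPERTS), key=lambda i: logits[i], reverse=True)[:TOP_K]
    gates = softmax([logits[i] for i in top])  # normalize over the chosen experts only
    return list(zip(top, gates))


def moe_layer(x, experts, router_weights):
    """Mix the outputs of the selected experts; the other experts are never run."""
    out = [0.0] * DIM
    for idx, gate in route(x, router_weights):
        y = experts[idx](x)
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out


rng = random.Random(0)
experts = [make_expert(rng) for _ in range(NUM_EXPERTS)]
router_weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
token = [0.5, -1.0, 0.25, 2.0]

selected = route(token, router_weights)
output = moe_layer(token, experts, router_weights)
print("experts used:", [idx for idx, _ in selected])
print("layer output:", output)
```

Only two of the eight toy experts do any work for this token, which is the property that lets a sparse model hold many parameters while computing with far fewer of them per token.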
Just as striking is the rapid growth of Mistral AI, the company behind the model. Within six months of its founding, Mistral AI’s valuation has surged to over $2 billion, underscoring the rising significance of large MoE models in the AI landscape. Mistral AI has also drawn attention for its unconventional release strategy: it made the model available for download before any official announcement, breaking with the traditional release pattern.
Mixtral 8x7B’s strong performance across diverse tasks, including language understanding and advanced mathematics, has not gone unnoticed. Mistral AI’s pricing is tiered, with options ranging from small to larger “cup” sizes, which gives customers with different needs and budgets some flexibility. Enthusiasts who want to test Mixtral themselves can download the model for local deployment. Reported speeds vary with the hardware setup, and some users report 52 tokens per second.
Mixtral 8x7B outperforms Llama 2 70B on various benchmarks while delivering a sixfold improvement in inference speed. It also offers a 32k-token context window, supports English, French, Italian, German, and Spanish, and performs strongly on code generation tasks. Alongside the base model, Mistral AI has released an Instruct version optimized for instruction following, which has achieved competitive scores as well.
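One way to see where the speed advantage can come from is to count parameters. The sketch below is a back-of-the-envelope calculation rather than an official breakdown. The configuration values (hidden size, layer count, grouped-query attention with 8 key/value heads, vocabulary size, untied embeddings, two active experts per token) are assumptions taken from the publicly released model configuration, not from this article.

```python
# Assumed configuration values (not stated in this article).
HIDDEN = 4096          # model (hidden) dimension
FFN_HIDDEN = 14336     # inner dimension of each expert feedforward block
LAYERS = 32
HEADS = 32
KV_HEADS = 8           # grouped-query attention
VOCAB = 32000
EXPERTS = 8            # feedforward blocks per layer
ACTIVE_EXPERTS = 2     # experts used per token

head_dim = HIDDEN // HEADS
kv_dim = KV_HEADS * head_dim

expert_params = 3 * HIDDEN * FFN_HIDDEN                       # gate, up, down projections
attention_params = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q, o and k, v projections
router_params = HIDDEN * EXPERTS                              # one score per expert
norm_params = 2 * HIDDEN                                      # two norm layers per block


def count(experts_used):
    """Parameters touched when `experts_used` experts run in every layer."""
    per_layer = (attention_params + router_params + norm_params
                 + experts_used * expert_params)
    embeddings = 2 * VOCAB * HIDDEN   # input embedding plus output head
    final_norm = HIDDEN
    return LAYERS * per_layer + embeddings + final_norm


total = count(EXPERTS)
active = count(ACTIVE_EXPERTS)
print(f"total parameters:  {total / 1e9:.1f}B")   # 46.7B
print(f"active per token:  {active / 1e9:.1f}B")  # 12.9B
```

Under these assumptions the total matches the 46.7B figure, while each token only passes through roughly 12.9B parameters. It also shows why the model is smaller than the 8 x 7B = 56B its name suggests: only the feedforward blocks are replicated, and the attention layers and embeddings are shared.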
Responding to Mixtral’s release, Andrej Karpathy of OpenAI referred to it as a “medium cup,” a remark that raised questions about OpenAI’s position in the changing AI landscape. Jim Fan, an AI expert at NVIDIA, commended Mistral AI for carving out a niche in a crowded field of nascent models. The rapid rise of Mistral AI, combined with the successful model reuse analysis by Princeton doctoral student Tianle Cai, points to a shift within the open-source AI community. As Mixtral 8x7B challenges the dominance of established models such as GPT-3.5, many speculate that OpenAI has no moat and that a new era of AI innovation and collaboration is beginning.
The emergence of Mixtral 8x7B marks a significant milestone in the AI landscape, given its capabilities and its potential to surpass existing models. It signals a shift in the balance of power among AI competitors. As Mistral AI continues to reshape the open-source AI community, many are watching for the next breakthrough.