Google Launches Gemini — a Powerful AI Model It Says Can Surpass GPT-4
On Wednesday, Google unveiled Gemini, a family of multimodal AI models designed to compete with OpenAI’s GPT-4. Google claims that Gemini surpasses GPT-4 on 30 of the 32 academic benchmarks widely used in large language model (LLM) research and development. The release follows PaLM 2, an earlier Google model that was intended to match GPT-4’s capabilities.
Like its rival GPT-4, Gemini is multimodal: it can process several types of input, including text, code, images, and audio. Google envisions Gemini as a technology that can reliably solve problems, offer advice, and answer questions in domains ranging from everyday inquiries to scientific work. The company believes such advances will herald a new era in computing, and it plans to integrate the technology into its existing products.
In an official statement, Google writes, “Gemini 1.0’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information. Its remarkable ability to extract insights from hundreds of thousands of documents through reading, filtering, and understanding information will help deliver new breakthroughs at digital speeds in many fields from science to finance.”
Google plans to offer Gemini in three sizes: Gemini Ultra, Gemini Pro, and Gemini Nano. The largest version, Gemini Ultra, is tailored for highly complex tasks, while Gemini Pro is designed to scale across a wide range of tasks. Gemini Nano is intended for on-device tasks, such as those performed on Google’s Pixel 8 Pro smartphone. The sizes differ in parameter count: larger models have more parameters, which generally makes them more capable but also more demanding to run. Gemini Nano, the smallest model, can run locally on consumer devices, while Gemini Ultra requires data center hardware.
Google CEO Sundar Pichai expressed his excitement about the possibilities presented by Gemini, stating, “This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company. I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.”
Although Google plans to make Gemini available in three sizes, only the mid-level model, Gemini Pro, is currently available to the public. Google’s AI chatbot, Bard, now uses a specially tuned version of Gemini Pro. Initial testing suggests that this version outperforms its predecessor, which was based on Google’s PaLM 2 language model.
Google claims that Gemini is more scalable and efficient than its previous AI models when run on the company’s custom Tensor Processing Units (TPUs). “On TPUs,” Google states, “Gemini runs significantly faster than earlier, smaller and less-capable models.”
Additionally, Gemini showcases its prowess in coding tasks. Google has trained a coding-centric version of the model called AlphaCode 2, which excels in solving competitive programming problems that extend beyond coding to encompass complex mathematics and theoretical computer science.
With Gemini’s introduction, Google is poised to elevate the capabilities and possibilities of AI models. By pushing the boundaries of technological advancements, Google hopes to empower individuals across the globe and unlock new opportunities in numerous fields.