Hyderabad: Google has unveiled its latest AI model, Gemini, which aims to redefine how text, images, video, audio, and code are processed. Available in three versions, Ultra, Pro, and Nano, Gemini is designed to operate seamlessly across different types of information and is set to revolutionize various industries.
Gemini’s capabilities were demonstrated in demos that showed it perceiving the world much as the human eye does, evaluating real-time data, and suggesting actions. The largest model, Gemini Ultra, is tailored for highly complex tasks, while Gemini Pro excels at scaling across a wide range of tasks. Gemini Nano, the smallest, is designed for on-device use and has already been integrated into the Pixel 8 Pro, powering features such as Summarize in the Recorder app and Smart Reply via Gboard on WhatsApp.
Google plans to expand Gemini’s integration into its products and services, including Search, Ads, Chrome, and Duet AI. Sundar Pichai, CEO of Alphabet and Google, expressed his excitement, saying the models are the first realization of the vision the company had when it formed Google DeepMind earlier this year.
Gemini Ultra, the largest model, has already outperformed existing large language models on 30 of 32 widely used academic benchmarks, according to Demis Hassabis, CEO of Google DeepMind. It also scores 90 percent on the Massive Multitask Language Understanding (MMLU) benchmark, surpassing human experts. The benchmark assesses knowledge and problem-solving skills across 57 subjects, including math, physics, history, law, medicine, and ethics.
A key highlight of Gemini is its flexibility: it runs efficiently on everything from data centers to mobile devices. This versatility lets developers build and scale AI applications, with the goal of creating models that feel like expert helpers or assistants, providing useful and intuitive interactions.
Gemini’s multimodal reasoning capabilities enable it to comprehend complex written and visual information, extracting insights from extensive datasets. Its first version demonstrates proficiency in understanding nuanced information, answering complex questions, and generating high-quality code.
Google plans to roll out Gemini in stages, starting with its integration into the Bard chatbot for English-language settings. Developers will be able to access Gemini through Google Cloud’s API from December 13th. Gemini Nano, the compact version, will power suggested message replies on Pixel 8 smartphones, with further integration into Google products expected in the coming months. Gemini Ultra, the most powerful version, is set to debut in 2024, pending thorough trust and safety checks.
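For developers planning to try the model when API access opens, a first call is expected to look roughly like the sketch below, which uses Google's generative AI Python SDK. The package name (google-generativeai), model identifier ("gemini-pro"), and method names are assumptions based on Google's published developer documentation, not details from this announcement.

```python
# Minimal sketch (assumed SDK surface): querying the Gemini Pro model
# through Google's generative AI Python SDK once API access is available.
import google.generativeai as genai

# API key obtained from Google's developer console / Google Cloud (assumption).
genai.configure(api_key="YOUR_API_KEY")

# "gemini-pro" is the assumed identifier for the mid-size Gemini Pro model.
model = genai.GenerativeModel("gemini-pro")

# Send a text prompt and print the generated response.
response = model.generate_content(
    "Summarize the key differences between Gemini Ultra, Pro, and Nano."
)
print(response.text)
```

The same SDK is also expected to accept multimodal input (for example, an image alongside a text prompt), which is where Gemini's cross-modal reasoning described above would come into play.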
To help ensure the responsible use of AI, Alphabet, Google’s parent company, is implementing new protections aligned with its safety policies and AI principles.
With Gemini’s arrival, Google has once again staked its claim in the fiercely competitive field of artificial intelligence. The latest model is set to redefine the boundaries of what is possible in processing text, images, video, audio, and code, promising groundbreaking applications across various industries.