Google Unveils Multimodal Language Model Gemini: Bard, Pixel Pro, and More

Google announced today the launch of Gemini, its new multimodal large language model developed by its AI division, DeepMind. Gemini aims to compete with OpenAI’s ChatGPT and will serve as the foundation for Google Bard, a chatbot that has struggled to gain recognition in the shadow of its competitor.

Gemini is unique among its AI counterparts because it was designed to be multimodal from the start, meaning it can handle text, audio, and image-based prompts. In a demo video, Gemini successfully identifies objects, infers actions in videos, generates music based on visual prompts, and even assesses children’s homework with a playful personality. However, it’s worth noting that the video description specifies that latency has been reduced and the Gemini outputs have been shortened for brevity.

According to Google’s CEO Sundar Pichai and DeepMind’s co-founder and CEO Demis Hassabis, Gemini comes in three versions: Ultra, Pro, and Nano. A fine-tuned version of Gemini Pro powers Google Bard, while the Nano variant will be incorporated into products like Pixel Pro smartphones. In the coming months, Gemini will also be integrated into Google Search, Ads, and Chrome. Public access to the Ultra version, however, will not be available until 2024.

Gemini’s technical report states that its most powerful version, Ultra, exceeds current state-of-the-art results on 30 of 32 widely used academic benchmarks in large language model research and development. The improvements may seem modest: Gemini Ultra correctly answers multidisciplinary questions 90% of the time, compared with 86.4% for OpenAI’s GPT-4. Even so, it is clear that Gemini poses real competition for ChatGPT.

Despite its impressive capabilities, Google acknowledges that Gemini is not flawless and is susceptible to the industry-wide challenge of hallucinations, where the AI model occasionally generates incorrect or nonsensical responses. To address this, Google subjected Gemini to extensive safety evaluations, including testing its response to problematic inputs and assessing potential biases.

Google plans to gradually integrate Gemini into its suite of products, starting with closed testing phases. If all goes according to plan, the public can expect a Gemini Ultra-powered Bard Advanced release next year. Nevertheless, predicting the outcomes of the ongoing AI arms race remains challenging.

When PopSci asked Bard whether it is powered by Gemini, the chatbot responded that it does not have access to information about internal Google projects. It recommended searching for information through official Google channels or contacting someone within the company for more details.

With Gemini’s arrival, Google aims to position itself as a leading player in the realm of multimodal language models, offering innovative capabilities that expand beyond traditional text-based AI models. As Gemini continues to evolve and integrate into Google’s products, it will undoubtedly shape the future of AI-powered interactions and redefine the boundaries of what AI can achieve.

In conclusion, Gemini represents a significant milestone for Google’s AI division, DeepMind, with its multimodal capabilities and potential to challenge OpenAI’s ChatGPT. While Gemini is not without its flaws, its integration into Google’s suite of products holds promise for delivering enhanced user experiences. With public access to the Ultra version still on the horizon, the ongoing AI arms race continues to captivate both industry experts and curious observers eager to witness the next breakthrough in artificial intelligence.

Frequently Asked Questions (FAQs) Related to the Above News

What is Gemini?

Gemini is a multimodal large language model developed by Google's AI division, DeepMind. It is designed to handle text, audio, and image-based prompts, and it aims to compete with OpenAI's ChatGPT.

What makes Gemini unique?

Gemini is unique because it was built to be multimodal from the start, allowing it to handle various types of inputs. It can identify objects, infer actions in videos, generate music based on visual prompts, and even assess children's homework with a playful personality.

What versions of Gemini are available?

Gemini comes in three versions: Ultra, Pro, and Nano. The Pro variant powers Google Bard, while the Nano variant will be incorporated into products like Pixel Pro smartphones. Public access to the Ultra version, however, will not be available until 2024.

How does Gemini compare to OpenAI's ChatGPT?

Gemini's most powerful version, Ultra, exceeds current state-of-the-art results on 30 of 32 widely used academic benchmarks. On multidisciplinary questions, it answers correctly 90% of the time, compared with 86.4% for OpenAI's GPT-4.

What limitations does Gemini have?

Like other AI models, Gemini has limitations. It is susceptible to hallucinations, where it occasionally generates incorrect or nonsensical responses. To address this, Google has subjected Gemini to extensive safety evaluations and tests.

When will Gemini be integrated into Google products?

Google plans to gradually integrate Gemini into its suite of products. It will start with closed testing phases, and a Gemini Ultra-powered Bard Advanced release is expected next year.

What is Google's vision for Gemini?

Google aims to position itself as a leading player in multimodal language models with Gemini. It aims to offer innovative capabilities beyond traditional text-based AI models, shaping the future of AI-powered interactions and expanding the boundaries of what AI can achieve.

How can I get information about Gemini from Google?

When asked, the Bard chatbot said it does not have access to information about internal Google projects. It recommends searching for information through official Google channels or contacting someone within the company for more details.

Please note that the FAQs provided on this page are based on the news article published. While we strive to provide accurate and up-to-date information, it is always recommended to consult relevant authorities or professionals before making any decisions or taking action based on the FAQs or the news article.
