Battle for the AI Throne: Google’s Gemini Ultra Beats GPT-4 in Most Tests, but Microsoft Counters with Medprompt

Google’s new AI language model, Gemini, has taken the spotlight with claims that it outperforms OpenAI’s GPT-4 in most tests. Microsoft, however, has fired back, asserting that GPT-4 holds the upper hand when given the right prompts.

Gemini, now available in three versions – Nano, Pro, and Ultra – has generated excitement with its impressive capabilities. In 30 of 32 commonly used benchmarks, Gemini Ultra surpassed GPT-4, showing superior performance in reading comprehension, math questions, Python coding, and image analysis. The margins varied: some tests showed only marginal differences, while others revealed gaps as large as ten percentage points.

Gemini Ultra’s headline accomplishment is its result on the Massive Multitask Language Understanding (MMLU) benchmark, where it scored an impressive 90.0 percent, slightly surpassing the human-expert baseline of 89.8 percent. Its strong showing across such a diverse range of fields – including math, physics, medicine, law, and ethics – has fueled optimism about its potential applications.
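
For context, the MMLU figure is plain multiple-choice accuracy: thousands of questions spanning 57 subjects, scored as the percentage answered correctly. The minimal Python sketch below illustrates the arithmetic; the record layout is an assumption made for this example, not the benchmark’s official schema.

```python
# Minimal illustration (not official benchmark code) of how an MMLU-style
# score is computed: the headline number is simply the share of
# multiple-choice questions answered correctly across all subjects.
def mmlu_score(results):
    """results: list of dicts like {'subject': 'physics', 'predicted': 'B', 'correct': 'B'}"""
    correct = sum(1 for r in results if r["predicted"] == r["correct"])
    return 100.0 * correct / len(results)

# Toy data with an assumed layout, purely for demonstration.
sample = [
    {"subject": "physics",  "predicted": "B", "correct": "B"},
    {"subject": "law",      "predicted": "C", "correct": "A"},
    {"subject": "medicine", "predicted": "D", "correct": "D"},
    {"subject": "ethics",   "predicted": "A", "correct": "A"},
]
print(f"MMLU-style accuracy: {mmlu_score(sample):.1f} percent")  # -> 75.0 percent
```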

While Google’s Gemini has sparked great interest, its roll-out will be gradual. Gemini Pro is now accessible to the public, integrated into Google’s chatbot Bard, while Gemini Nano is embedded in various functions on the Pixel 8 Pro smartphone. Gemini Ultra, meanwhile, is still undergoing safety testing and is currently available only to a limited group of developers, partners, and safety and responsibility experts. Google plans to make it publicly available via Bard Advanced early next year.

However, Microsoft is not prepared to let Google’s claims go unchallenged. Microsoft researchers recently published their work on Medprompt, an approach that uses carefully composed prompts to improve GPT-4’s results. With Medprompt, GPT-4 excelled in numerous tests, including MMLU, where it achieved a score of 90.10 percent – narrowly ahead of Gemini Ultra’s reported 90.0 percent. The battle for AI supremacy is fierce, and it remains to be seen which language model will ultimately reign supreme.
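
Microsoft describes Medprompt as a composition of general-purpose prompting techniques – dynamic few-shot example selection, model-generated chain of thought, and choice-shuffle ensembling. As a rough illustration of the last of these, the Python sketch below shuffles a question’s answer options across several calls and majority-votes the replies. Note that ask_model is a hypothetical placeholder for any chat-completion client; this is a sketch of the idea, not Microsoft’s actual implementation.

```python
import random
from collections import Counter

def ask_model(prompt):
    """Hypothetical placeholder: send a prompt to an LLM and return its reply
    (expected to end with a single answer letter)."""
    raise NotImplementedError("plug in your preferred LLM client here")

def choice_shuffle_ensemble(question, choices, n_votes=5):
    """Ask the same multiple-choice question several times with the answer
    options shuffled, then majority-vote after mapping each reply back to the
    original option order. This mirrors the choice-shuffle ensembling step
    Microsoft describes as one component of Medprompt."""
    letters = "ABCD"
    votes = Counter()
    for _ in range(n_votes):
        order = list(range(len(choices)))
        random.shuffle(order)  # permute options to cancel position bias
        body = "\n".join(f"{letters[i]}. {choices[j]}" for i, j in enumerate(order))
        prompt = (f"{question}\n{body}\n"
                  "Think step by step, then answer with a single letter.")
        reply = ask_model(prompt).strip()
        shown = letters.index(reply[-1].upper())  # letter as displayed this round
        votes[order[shown]] += 1                  # index in the original option order
    return letters[votes.most_common(1)[0][0]]
```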

As Gemini and GPT-4 continue to vie for dominance, the future of AI hangs in the balance. With each model pushing the boundaries of language understanding and processing, the potential for groundbreaking advancements is within reach. The journey towards the AI throne is far from over, and the developments in this fierce competition will surely shape the landscape of AI technology moving forward.

Frequently Asked Questions (FAQs) Related to the Above News

What is Gemini?

Gemini is Google's new AI language model. It comes in three versions – Nano, Pro, and Ultra – and has shown superior performance to OpenAI's GPT-4 in most reported benchmark tests.

How did Gemini Ultra perform against GPT-4?

In most tests, Gemini Ultra outperformed GPT-4, demonstrating better capabilities in reading comprehension, math questions, Python coding, and image analysis.

Was there a significant difference between Gemini Ultra and GPT-4?

While some tests showed marginal disparities, others exhibited gaps as large as ten percentage points, highlighting the varying degrees of superiority displayed by Gemini Ultra.

What is the most impressive accomplishment of Gemini Ultra?

Gemini Ultra scored 90.0 percent on the Massive Multitask Language Understanding (MMLU) benchmark, slightly surpassing even the human-expert baseline of 89.8 percent.

What are the current availability plans for Google's Gemini?

Gemini Pro is now accessible to the public through Google's chatbot Bard, while Gemini Nano is embedded in various functions on the Pixel 8 Pro smartphone. Gemini Ultra, still undergoing safety testing, is available only to a limited group of developers, partners, and safety and responsibility experts, but is expected to reach the public via Bard Advanced early next year.

How has Microsoft responded to Gemini's claims?

Microsoft has countered Gemini's claims by asserting that GPT-4 holds the upper hand in certain tests when provided with carefully composed prompts, as demonstrated by its Medprompt approach.

How did Microsoft's GPT-4 perform with the Medprompt approach?

Leveraging Medprompt, GPT-4 achieved improved results across numerous tests, including a score of 90.10 percent on the MMLU benchmark – narrowly ahead of Gemini Ultra's reported 90.0 percent.

What is the outlook for the battle between Gemini and GPT-4?

The battle for AI supremacy is fierce, and it remains to be seen which language model will ultimately prevail. Both models continue to push the boundaries of language understanding and processing.

How will the developments in this competition impact the future of AI?

The competition between Gemini and GPT-4 will shape the landscape of AI technology moving forward, potentially leading to groundbreaking advancements in language understanding and processing.
