DeepMind Technologies Limited, known as Google DeepMind, has unveiled its latest generative AI model, Gemini 1.0, in three sizes targeting different tasks: Ultra, Pro, and Nano. The launch has sparked a heated debate among influencers on social media platform X about how it compares with GPT-4, according to GlobalData, a prominent data and analytics company.
GlobalData's Social Media Analytics Platform has observed intense discussion among influencers about Gemini AI's capabilities and how it was evaluated. In particular, influencers have raised concerns about the evaluation criteria. Smitarani Tripathy, Social Media Analyst at GlobalData, states that influencers see Gemini Ultra as inferior to GPT-4 in standard 5-shot evaluations, with Gemini surpassing GPT-4 only when the CoT@32 methodology (chain of thought with 32 samples) is used.
Tripathy highlights influencers' skepticism about the practicality of CoT@32 in real-world scenarios and their view that GPT-4 remains superior. Influencers also stress the importance of the MMLU benchmark and call for more transparent evaluation through API endpoints or released model weights rather than blog posts alone.
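For readers unfamiliar with the methodologies under debate, the sketch below contrasts standard greedy scoring (as in a 5-shot evaluation) with the uncertainty-routed chain-of-thought scoring mentioned by influencers, in which a majority vote over k sampled chains of thought is used only when the vote is confident enough, and the greedy answer is used otherwise. The function names, the toy data, and the 0.6 confidence threshold are illustrative assumptions, not values taken from Google's materials.

```python
from collections import Counter

def score_greedy(greedy_answers, gold):
    """Standard accuracy: one greedy answer per question (e.g. 5-shot prompting)."""
    return sum(a == g for a, g in zip(greedy_answers, gold)) / len(gold)

def score_uncertainty_routed(cot_samples, greedy_answers, gold, threshold=0.6):
    """Uncertainty-routed CoT@k (sketch): use the majority vote over k sampled
    chains of thought when the vote share clears a confidence threshold,
    otherwise fall back to the greedy answer. The threshold is illustrative."""
    correct = 0
    for samples, greedy, g in zip(cot_samples, greedy_answers, gold):
        answer, votes = Counter(samples).most_common(1)[0]
        chosen = answer if votes / len(samples) >= threshold else greedy
        correct += chosen == g
    return correct / len(gold)

# Toy data: 3 questions, 4 sampled answers each (the debated setup uses k=32).
gold   = ["B", "C", "A"]
greedy = ["B", "D", "A"]
cot    = [["B", "B", "B", "A"],   # confident majority -> "B"
          ["C", "C", "D", "C"],   # confident majority -> "C"
          ["A", "D", "C", "B"]]   # no consensus -> fall back to greedy "A"

print("greedy (5-shot style) accuracy:", round(score_greedy(greedy, gold), 2))
print("uncertainty-routed CoT accuracy:", round(score_uncertainty_routed(cot, greedy, gold), 2))
```

The point of contention is that these two scoring routes can rank models differently: a model can trail under greedy decoding yet lead once many samples are aggregated, which is why influencers push for like-for-like, 5-shot vs. 5-shot comparisons.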
In order to provide an overview of influencer opinions, GlobalData’s Social Media Analytics Platform has captured a few popular quotes:
– One influencer pointed out that Gemini's use of an uncertainty-routed chain-of-thought evaluation to claim a better MMLU score seemed incomplete, further highlighting that GPT-4 outperformed Gemini in both the greedy and CoT@32 analyses.
– Another influencer expressed disappointment that Gemini Ultra only surpasses GPT-4 when using CoT@32, suggesting that Gemini’s inherent power should have enabled it to win in a 5-shot comparison.
– A different influencer emphasized the need for practical assessments and questioned whether Gemini truly beats GPT-4, given the small difference in performance. They speculated that this could point to the limits of large language models, or that Google's goal was merely to edge past GPT-4.
– Digging into the MMLU benchmark, an influencer noted that Gemini does not truly beat GPT-4 there: Gemini's win is specific to CoT@32, while GPT-4 still comes out ahead in the standard 5-shot evaluation.
Overall, influencers are skeptical about Gemini’s capabilities and are calling for practical assessments to ascertain its true potential. They also emphasize the importance of direct 5-shot vs. 5-shot comparisons for a more straightforward evaluation.
As the debates continue, it remains to be seen how Gemini AI will fare against GPT-4 in different evaluation scenarios. The influencers’ differing opinions reflect the complexity of evaluating AI models and the ongoing quest for advancements in the field.