Google Admits Editing AI Video as It Pursues OpenAI
But on most of the benchmarks, Gemini Ultra beat OpenAI’s GPT-4 model by only a few percentage points. In other words, Google’s top AI model has only made narrow improvements on something that OpenAI completed work on at least a year ago.
And Ultra is still under wraps. If it’s released in early January, as Google has suggested, Gemini Ultra might not stay the top model for very long. In the time it has taken Google to catch up to OpenAI, the nimbler player has had almost a year to work on its next AI model, GPT-5.
Then there’s the video demo that technologists described as jaw-dropping on X, the site formerly known as Twitter.
On first viewing, this is impressive stuff. The model’s ability to track a ball of paper from under a plastic cup, or to infer that a dot-to-dot picture was a crab before it is even drawn, shows glimmers of the reasoning abilities that Google’s DeepMind AI lab has cultivated over the years.
That’s missing from other AI models. But many of the other capabilities on display are not unique and can be replicated by ChatGPT Plus, as Wharton professor Ethan Mollick has demonstrated.
Google also admits that the video is edited. “For the purposes of this demo, latency has been reduced and Gemini outputs have been shortened for brevity,” it states in its YouTube description. This means the time it took for each response was actually longer than in the video.
In reality, the demo also wasn’t carried out in real time or in voice. When asked about the video by Bloomberg Opinion, a Google spokesperson said it was made by using still image frames from the footage and prompting via text, and they pointed to a site showing how others could interact with Gemini with photos of their hands, or of drawings or other objects. In other words, the voice in the demo was reading out prompts that humans had typed to Gemini, which was being shown still images rather than live video. That’s quite different from what Google seemed to be suggesting: that a person could have a smooth voice conversation with Gemini as it watched and responded in real time to the world around it.
The video also doesn’t specify that this demo is (probably) with Gemini Ultra, the model that’s not here yet. Fudging such details points to the broader marketing effort here: Google wants us to remember that it’s got one of the largest teams of AI researchers in the world and access to more data than anyone else. It wants to remind us how vast its deployment network is by bringing less-capable versions of Gemini to Chrome, Android, and Pixel phones.
But being everywhere isn’t always the advantage it seems in tech. Early mobile kings Nokia Oyj and BlackBerry learned that the hard way in the 2000s when Apple jumped in with the iPhone, a more capable and intuitive product, and ate their lunches. In software, market success comes from having the best-performing systems.
Google’s showboating is almost certainly timed to capitalize on all the recent turmoil at OpenAI. When a board coup at the smaller AI startup temporarily ousted CEO Sam Altman and put the company’s future in doubt, Google swiftly launched a sales campaign to persuade OpenAI’s corporate customers to switch to Google, according to a report in The Wall Street Journal. Now it seems to be riding that wave of uncertainty with the launch of Gemini.
But impressive demos can only get you so far, and Google has demonstrated uncanny new tech before that didn’t go anywhere. (Remember Duplex?) Google’s gargantuan bureaucracy and layers of product managers have kept it from shipping products as nimbly as OpenAI until now. As society grapples with AI’s transformative effects, that’s no bad thing. But take Google’s latest show of sprinting ahead with a pinch of salt. It’s still coming up from behind.