A self-proclaimed artist, idiot, and maker has remade Google’s now discredited Gemini demo using technology from its most obvious AI rival, ChatGPT. Greg Technology published a short video wherein the eponymous tech tinkerer discussed a drawing of a duck, asked about some hand-signal emojis, and got OpenAI’s GPT-4V AI to identify a game that was being played. Greg’s video might lack the polish of the Gemini AI demo, but it truly mixes voice and vision prompts in real-time throughout.
For some context to the video recording from Greg Technology, it is worth a look at Google’s Gemini AI launch video titled “Hands-on with Gemini.” On launch day, this was the flagship video, and Google CEO Sundar Pichai said the best way to understand “Gemini’s underlying amazing capabilities is to see them in action.”
It soon transpired that the impressively cute and slick Google Gemini AI video was staged. The main issue that caused disappointment among AI watchers was that the video presented wasn’t recorded in real-time – instead, Gemini responded to a series of still images. Additionally, all the voice interaction was dubbed in later as part of the video production process, whereas Gemini had actually responded to text prompts throughout the demo.
Above, you can see the Greg Technology real-time demo, which replicates some of the key sections of the Gemini AI “hands-on” escapade. Greg provides a preamble to the action during the first half of the clip. In brief, he recalls seeing the “super exciting” Gemini video, with its to-and-fro between the presenter talking and doing things – with an AI robot voice demonstrating its understanding of what was happening. In Greg’s opinion, Google had produced “not a real kind of honest demo.”
The thorny situation for Google sparked Greg to wonder if he could do his own “Remake of the Google Gemini fake demo, except using GPT-4, and it's real.” Hence, the title of the embedded video.
An important update to GPT-4 arrived in recent weeks, with a vision extension available. Greg thought that with GPT-4V, he could remake the Gemini AI demo, and you can see him walking through a few of the same AI stretching exercises in the second half of his video. One of the things we see and hear during the Greg Technology video is the pregnant pause between the user's voice prompt and GPT-4V making its verbal response. Google’s “Hands-on with Gemini” demo video was launched with a disclaimer saying, “latency has been reduced, and Gemini outputs have been shortened for brevity.” But we sadly learned the demo showreel went through much more post-processing and editing than that.
Greg Technology made his demo code available via GitHub.
Mark Tyson is a Freelance News Writer at Tom's Hardware US. He enjoys covering the full breadth of PC tech; from business and semiconductor design to products approaching the edge of reason.