Thursday, September 19, 2024

Google Acknowledges Manipulation of AI Viral Video to Enhance Visuals


Google's impressive video showcasing the capabilities of its artificial intelligence (AI) model, Gemini, has been revealed to be misleading. The video, which garnered 1.6 million views on YouTube, appeared to show the AI responding in real time to spoken-word prompts and video. However, Google admitted that the responses in the video were sped up for the purposes of the demonstration. In a blog post published alongside the video, Google explained how the video was created. Bloomberg later confirmed that the AI was prompted using still image frames from the footage, along with text prompts. A Google spokesperson stated that the video accurately displayed the prompts and outputs from Gemini, and that it was intended to showcase the AI's capabilities and inspire developers.

The video features a person asking Google's AI a series of questions while presenting objects on the screen. On closer inspection, it becomes apparent that the AI was not responding directly to the voice or video prompts. For instance, when shown a rubber duck and asked whether it would float, the AI identified the object from a still image of the duck and a text prompt about the squeaking sound it makes when squeezed. In another impressive moment, the person performs a cups and balls magic trick, and the AI determines the location of the hidden ball. However, this feat was accomplished by showing the AI a series of still images rather than a video.

Google clarified that the footage from the video was used to create the demo and test Gemini's capabilities. While sequences were shortened and still images were used, the voiceover in the video was taken directly from the written prompts provided to Gemini. The video also featured a segment in which the AI appeared to invent a game called "guess the country" based on clues from a world map. In reality, the AI was given specific instructions and examples to generate clues and determine the correct country based on still images of the map.

Although Google's AI model is impressive, its reliance on still images and text-based prompts puts its capabilities on par with OpenAI's GPT-4. Interestingly, the video was released shortly after a period of turmoil in the AI industry following Sam Altman's controversial removal and subsequent rehiring as CEO of OpenAI. While it remains unclear which AI model is more advanced, Altman's statement that OpenAI is working on the next version of its AI suggests that Google may be playing catch-up.