VLOGGER Is Google's Image-To-Video AI Tool That Can Be Controlled By Voice

Although Google VLOGGER is not yet available for testing, according to multiple reports, the demonstration hints at its potential to let users create and control avatars using voice commands.

With Artificial Intelligence (AI) being the hottest buzzword in tech, researchers at search engine giant Google have been busy unveiling a string of innovative models and concepts. Their latest creation, which follows their recent advances in game-playing AI, transforms a static image into a controllable avatar. Although Google VLOGGER is not yet available for testing, according to multiple reports, the demonstration hints at its potential to let users create and control avatars using voice commands.

However, a user named Madni Aghadi (@hey_madni) has posted on X, formerly Twitter: "Google just dropped VLOGGER, and it's crazy. This is going to transform the future of VIDEO forever. Here’s everything you need to stay ahead of the curve: 🧵 👇."

It should be noted that the image posted by Aghadi is a mockup and not real. VLOGGER is the tech giant's research project that may be able to "make photos come alive" via AI in the future. While existing tools such as Pika Labs' lip sync, HeyGen's video translation services and Synthesia offer similar functionality to some degree, Google VLOGGER appears to offer a more straightforward, bandwidth-friendly alternative.

What Is VLOGGER

Currently, VLOGGER remains a research endeavour featuring a few entertaining demo videos. However, should it evolve into a product, it has the potential to revolutionise communication on platforms like Teams or Slack.

This AI model has the capability to generate a dynamic avatar from a static image while preserving the photorealistic appearance of the individual throughout every frame of the resulting video.

Moreover, the model integrates an audio file of the individual speaking, orchestrating body and lip movements to mirror the natural gestures and expressions that the person would exhibit if they were speaking in real life. This also includes generating head movements, facial expressions, eye movements, blinking, as well as hand gestures and upper body motions, all without relying on any additional references beyond the provided image and audio.

A GitHub post further explains VLOGGER (in its abstract) as follows: "We propose VLOGGER, a method for text and audio-driven talking human video generation from a single input image of a person, which builds on the success of recent generative diffusion models."

"Our method consists of 1) a stochastic human-to-3d-motion diffusion model, and 2) a novel diffusion based architecture that augments text-to-image models with both temporal and spatial controls. This approach enables the generation of high quality videos of variable length, that are easily controllable through high-level representations of human faces and bodies. In contrast to previous work, our method does not require training for each person, does not rely on face detection and cropping, generates the complete image (not just the face or the lips), and considers a broad spectrum of scenarios (e.g., visible torso or diverse subject identities) that are critical to correctly synthesize humans who communicate."
