May 21, 2024



Microsoft VASA-1: With images and sound to create a talking image with AI technology

The Microsoft Research authors provide a number of examples in the project's blog post.

The impression that a still image shows a moving person rests on what Microsoft calls holistic facial dynamics: facial expressions and head movements are generated realistically and matched to the voice input. In this way, the AI working in the background offers a simple multimedia way to communicate with others through a dynamic, contemporary visual presence. The team explains:

[…] Our method not only provides high video quality with realistic face and head dynamics, but also supports online creation of 512 x 512 videos at up to 40 fps with negligible startup latency. It paves the way for real-time interactions with lifelike avatars that mimic human conversational behaviors.

The functionality is not yet available to the public. Microsoft also states that the demo used only images generated by DALL-E 3 and that no real people were imitated. The model is to be released publicly only once Microsoft can ensure that it will be used safely. Whether illicit uses of such models can be ruled out, however, remains questionable.

AI helps creatives, recruiters and their partners.

Microsoft is not alone. Meta, for example, already offers ways to create AI-based influencer chatbots that interact with fans and followers in place of the creator, which promises greater efficiency. This opens new possibilities for social media managers (on LinkedIn, for example), for recruiters, and for anyone who simply wants to keep up engagement with accounts and other people. It also raises the question of how authentic such communication still is.

In the long term, talking faces could also appear on websites and in profiles as visual avatars that can be adjusted at any time and could improve the user experience. AI technology still has many updates up its sleeve: numerous other technology companies, from Pika Labs to OpenAI to Adobe, are currently working on developments such as combining audio and visuals.

