Microsoft Research Asia’s latest experimental AI tool can produce some rather convincing deepfakes.
VASA-1 can take a still image or drawing of a person, combine it with an audio clip, and generate a lifelike talking face of them almost instantly.
The program generates realistic-looking results and can imitate a person’s facial expressions and head motions from a photograph, as well as move their lips so that they appear to be the one speaking or singing. The organization shared footage of VASA-1 in action that is rather exciting to see.
A skilled eye may notice that the head motions are slightly robotic up close, but the results are rather convincing. This is one of the reasons why researchers are not publishing an online demo, API, or product that employs VASA-1 until they are confident it “will be used responsibly and in accordance with proper regulations.”
“We are opposed to any behavior that creates misleading or harmful content about real people, and we are interested in applying our technique to advance forgery detection,” the organization stated, adding that videos made with the program still contain identifiable artifacts.
Finally, the researchers believe the tool might be used to provide companionship and therapeutic support to people in need, as well as to provide a “person” with whom people can converse in situations where AI is used.
Earlier this week, Microsoft took its WizardLM-2 AI model down less than a day after it was released due to a lack of toxicity testing.