Generating Talking Head Models From Text With Synced Voice

In this report I introduce a pipeline to animate a chatbot.

From a single photo and, optionally, a reference voice, the pipeline synthesizes a realistic talking head. The head can be prompted with text at runtime to generate talking sequences in which the audio is synced to the lip movements.

Resources

Code
Paper