Ph.D. candidate @ HKUST-GZ, supervised by Prof. Harry SHUM and Prof. Lionel NI.
Selected Projects
Huawei · Face Blur Detection, Virtual Makeup (2019)
Tencent AI Lab · Emotion Classification, Neural Machine Translation (2020-2021)
Xiaobing · Multimodal Conversation, Audio-driven Talking Head Generation (2021-2022)
Step Fun · Text-to-Video Generation, Autoregressive Video Generation (2024)
We introduce a novel framework for one-shot audio-driven talking head generation. Unlike prior works that require additional driving sources for controlled, deterministic synthesis, we instead sample all holistic lip-irrelevant facial motions (e.g., pose, expression, blink, and gaze) to semantically match the input audio, while still maintaining both the photo-realism of audio-lip synchronization and overall naturalness.