# Audio2Head

**Repository Path**: xinca/Audio2Head

## Basic Information

- **Project Name**: Audio2Head
- **Description**: Audio2Head generates a one-shot talking-head video from a single reference photo and a speech audio clip. It preserves both the prosody of the speech and the subject's appearance, models head motion in addition to facial motion, and accounts for artifacts in the background region.
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2024-03-20
- **Last Updated**: 2024-03-20

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion (IJCAI 2021)

#### [Paper](https://www.ijcai.org/proceedings/2021/0152.pdf) | [Demo](https://www.youtube.com/watch?v=xvcBJ29l8rA)

#### Requirements

- Python 3.6, PyTorch >= 1.6, and ffmpeg
- Other requirements are listed in `requirements.txt`

#### Pretrained Checkpoint

Please download the pretrained checkpoint from [google-drive](https://drive.google.com/file/d/1tvI43ZIrnx9Ti2TpFiEO4dK5DOwcECD7/view?usp=sharing) and put it in the `/checkpoints` folder.

#### Generate Demo Results

```
python inference.py --audio_path xxx.wav --img_path xxx.jpg
```

Note that the input image must have equal height and width, and the face should be appropriately cropped as in `/demo/img` (see the preprocessing sketch at the end of this README).

#### License and Citation

```
@InProceedings{wang2021audio2head,
  author    = {Suzhen Wang and Lincheng Li and Yu Ding and Changjie Fan and Xin Yu},
  title     = {Audio2Head: Audio-driven One-shot Talking-head Generation with Natural Head Motion},
  booktitle = {the 30th International Joint Conference on Artificial Intelligence (IJCAI-21)},
  year      = {2021},
}
```

#### Acknowledgement

This codebase builds on [First Order Motion Model](https://github.com/AliaksandrSiarohin/first-order-model); thanks to its authors for their contribution.
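
#### Input Preprocessing (Illustrative)

As a rough illustration of the square-crop requirement above, here is a minimal preprocessing sketch. It is not part of this repository: the helper name `crop_to_square`, the 256x256 target size, and the use of Pillow are all assumptions; check `/demo/img` for the framing the model actually expects.

```python
# Hypothetical helper (not part of the repository): center-crop an image to a
# square and resize it so its height and width match, as inference.py requires.
from PIL import Image

def crop_to_square(input_path, output_path, size=256):
    """Center-crop an image to a square, then resize to size x size (assumed target)."""
    img = Image.open(input_path).convert("RGB")
    w, h = img.size
    side = min(w, h)                     # largest centered square that fits
    left = (w - side) // 2
    top = (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((size, size), Image.LANCZOS)
    img.save(output_path)

if __name__ == "__main__":
    # Example file names are placeholders; use your own portrait image.
    crop_to_square("portrait.jpg", "portrait_square.jpg")
```

A center crop is only a starting point; for off-center faces you may need to adjust the crop box manually (or with a face detector) so the result resembles the examples in `/demo/img`.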