First, set up the environment and install the dependencies:

```bash
conda create --prefix env/ python=3.8 -y
conda activate env/
pip install -r requirements.txt
```
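Optionally, you can do a quick sanity check that the prefix environment is active and the dependencies resolved (this check is a suggestion, not part of the repository's scripts):

```bash
conda activate env/
python --version   # should report Python 3.8.x
pip check          # verifies installed packages have compatible dependencies
```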
Then download the files from this link and place them as follows (a scripted version is sketched after this list):

- `Obama2.zip` and `APC_epoch_160.model` in `src/face_generator/data`, then extract `Obama2.zip` there.
- `GPEN-BFR-512_trace.pt`, `RealESRGAN_x2plus_trace.pt`, and `RetinaFace-R50_trace.pt` in `src/face_res/models`.
- `wiki.zip` in `src/face_res`, then extract it there.
- `00000189-checkpoint.pth.tar` in `src/face_reenactment/config`.
- `shape_predictor_68_face_landmarks.dat` in `src/style_metrics`.
- `RAVDESS.zip` in `.` (the repository root), then extract it there.
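The placement above can also be scripted. The sketch below assumes the downloaded files sit in a local `downloads/` folder and that `unzip` is installed; both are assumptions, so adjust the paths to wherever you saved the files.

```bash
# Sketch only: assumes all downloads are in ./downloads/ and `unzip` is available.
mkdir -p src/face_generator/data src/face_res/models \
         src/face_reenactment/config src/style_metrics

mv downloads/Obama2.zip downloads/APC_epoch_160.model src/face_generator/data/
unzip src/face_generator/data/Obama2.zip -d src/face_generator/data/

mv downloads/GPEN-BFR-512_trace.pt downloads/RealESRGAN_x2plus_trace.pt \
   downloads/RetinaFace-R50_trace.pt src/face_res/models/

mv downloads/wiki.zip src/face_res/
unzip src/face_res/wiki.zip -d src/face_res/

mv downloads/00000189-checkpoint.pth.tar src/face_reenactment/config/
mv downloads/shape_predictor_68_face_landmarks.dat src/style_metrics/

mv downloads/RAVDESS.zip ./
unzip RAVDESS.zip -d ./
```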
Simply run:

```bash
bash main.sh -i <image path> -a <audio path> -o <output path>
```

The model only accepts audio files with the extension `.wav` or `.mp3`, and the input image must be square. For example, with the provided `inputs` folder, you would run:

```bash
bash main.sh -i inputs/image.jpg -a inputs/sample.wav -o ./output
```
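If your source files do not already meet these constraints, you can preprocess them first. This is a minimal sketch, assuming `ffmpeg` and ImageMagick are installed (neither ships with this repository); `photo.png` and `clip.m4a` are placeholder file names.

```bash
# Center-crop an image to a square using ImageMagick (crop size = smaller dimension)
SIZE=$(identify -format '%[fx:min(w,h)]' photo.png)
convert photo.png -gravity center -crop "${SIZE}x${SIZE}+0+0" +repage inputs/image.jpg

# Convert any audio file to .wav with ffmpeg
ffmpeg -i clip.m4a inputs/sample.wav
```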
- train (coming soon)