We introduce DynamicPose, a simple and robust framework for animating human images, specifically designed for portrait animation driven by human pose sequences.
We are committed to providing the complete source code for free and regularly updating DynamicPose. By open-sourcing this technology, we aim to drive advancements in the digital human field and promote the widespread adoption of virtual human technology across various industries. If you are interested in any of the modules, please feel free to email us to discuss further. Additionally, if our work can benefit you, we would greatly appreciate it if you could give us a star ⭐!
We recommend Python >= 3.10 and CUDA 11.7. Then build the environment as follows:
# [Optional] Create a virtual env
python -m venv .venv
source .venv/bin/activate
# Install with pip:
pip install -r requirements_min.txt
pip install --no-cache-dir -U openmim
mim install mmengine
mim install "mmcv>=2.0.1"
mim install "mmdet>=3.1.0"
mim install "mmpose>=1.1.0"
You can download the weights manually in the following steps:
Download our trained weights, which include four parts: denoising_unet.pth, reference_unet.pth, pose_guider.pth, and motion_module.pth.
Download the pose weights (rtmw-x_simcc-cocktail14_pt-ucoco_270e-384x288-f840f204_20231122.pth and rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth) and the corresponding config scripts from the mmpose repository. Finally, these weights should be organized as follows:
./pretrained_weights/
|-- rtmpose
| |-- rtmw-x_simcc-cocktail14_pt-ucoco_270e-384x288-f840f204_20231122.pth
| |-- rtmw-x_8xb320-270e_cocktail14-384x288.py
| |-- rtmdet_m_640-8xb32_coco-person.py
| `-- rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth
|-- DWPose
| |-- dw-ll_ucoco_384.onnx
| `-- yolox_l.onnx
|-- image_encoder
| |-- config.json
| `-- pytorch_model.bin
|-- denoising_unet.pth
|-- motion_module.pth
|-- pose_guider.pth
|-- reference_unet.pth
|-- sd-vae-ft-mse
| |-- config.json
| |-- diffusion_pytorch_model.bin
| `-- diffusion_pytorch_model.safetensors
`-- stable-diffusion-v1-5
|-- feature_extractor
| `-- preprocessor_config.json
|-- model_index.json
|-- unet
| |-- config.json
| `-- diffusion_pytorch_model.bin
`-- v1-inference.yaml
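Before running inference, it may help to verify that the files are in place. The following sketch simply checks for the checkpoint paths listed in the tree above:

```python
from pathlib import Path

# Key checkpoint files, mirroring the layout shown above.
WEIGHTS_DIR = Path("./pretrained_weights")
EXPECTED = [
    "denoising_unet.pth",
    "reference_unet.pth",
    "pose_guider.pth",
    "motion_module.pth",
    "rtmpose/rtmw-x_simcc-cocktail14_pt-ucoco_270e-384x288-f840f204_20231122.pth",
    "rtmpose/rtmdet_m_8xb32-100e_coco-obj365-person-235e8209.pth",
    "DWPose/dw-ll_ucoco_384.onnx",
    "DWPose/yolox_l.onnx",
    "image_encoder/pytorch_model.bin",
    "sd-vae-ft-mse/diffusion_pytorch_model.bin",
    "stable-diffusion-v1-5/unet/diffusion_pytorch_model.bin",
]

missing = [p for p in EXPECTED if not (WEIGHTS_DIR / p).exists()]
if missing:
    print("Missing weights:")
    for p in missing:
        print(f"  {p}")
else:
    print("All expected weights are in place.")
```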
Run stage 1 (pose-to-image generation):
python -m scripts.pose2img --config ./configs/prompts/animation_stage1.yaml -W 512 -H 768
Run stage 2 (pose-to-video generation):
python -m scripts.pose2vid --config ./configs/prompts/animation_stage2.yaml -W 512 -H 784 -L 64
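If you prefer to drive both stages from a single script, a minimal subprocess wrapper could look like the sketch below; the flags and config paths are copied verbatim from the two commands above:

```python
import subprocess

# Run the two inference stages back to back, using the same flags as the
# commands shown above. check=True aborts if either stage fails.
subprocess.run(
    ["python", "-m", "scripts.pose2img",
     "--config", "./configs/prompts/animation_stage1.yaml",
     "-W", "512", "-H", "768"],
    check=True,
)
subprocess.run(
    ["python", "-m", "scripts.pose2vid",
     "--config", "./configs/prompts/animation_stage2.yaml",
     "-W", "512", "-H", "784", "-L", "64"],
    check=True,
)
```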
To prepare pose data for your own reference images and target videos, run:
python data_prepare/video2pose.py path/to/ref/images path/to/save/results image  # image mode
python data_prepare/video2pose.py path/to/tgt/videos path/to/save/results video  # video mode
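To process both kinds of input in one pass, a small helper can wrap the script. This is a sketch assuming video2pose.py takes an input path, an output path, and a mode, exactly as in the two commands above; the paths are placeholders:

```python
import subprocess

# Extract pose maps for reference images and target videos in one go.
# Each entry is (input path, mode), matching the CLI shown above.
for src, mode in [("path/to/ref/images", "image"), ("path/to/tgt/videos", "video")]:
    subprocess.run(
        ["python", "data_prepare/video2pose.py", src, "path/to/save/results", mode],
        check=True,
    )
```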
This work also has some limitations, which are outlined below:
When the input image features a profile face, the model is prone to generating distorted faces.
When the background is complex, the model struggles to accurately distinguish between the human body region and the background region.
When the input image features a person with objects attached to their hands, such as bags or phones, the model has difficulty deciding whether to include these objects in the generated output.
Code: the code of DynamicPose is released under the MIT License.
Other models: other open-source models used in this project must comply with their own licenses, e.g., stable-diffusion-v1-5, dwpose, and rtmpose.

If DynamicPose is useful to your work, please cite it as:

@software{DynamicPose,
author = {Yanqin Chen, Changhao Qiao, Bin Zou, Dejia Song},
title = {DynamicPose: An effective image-to-video framework for portrait animation driven by human pose sequences},
month = {August},
year = {2024},
url = {https://github.com/dynamic-X-LAB/DynamicPose}
}