From 549d245764e5fbf558cdb8545d49c5768ac4a9eb Mon Sep 17 00:00:00 2001
From: Artem Lukoianov
Date: Sat, 31 Aug 2024 18:20:11 -0400
Subject: [PATCH 01/14] NEW: sdi partially implemented
---
README.md | 767 ++----------------
configs/sdi.yaml | 123 +++
threestudio/models/guidance/__init__.py | 1 +
.../guidance/stable_diffusion_sdi_guidance.py | 565 +++++++++++++
threestudio/systems/__init__.py | 1 +
threestudio/systems/sdi.py | 265 ++++++
6 files changed, 1021 insertions(+), 701 deletions(-)
create mode 100644 configs/sdi.yaml
create mode 100644 threestudio/models/guidance/stable_diffusion_sdi_guidance.py
create mode 100644 threestudio/systems/sdi.py
diff --git a/README.md b/README.md
index 19495b47..2008f253 100644
--- a/README.md
+++ b/README.md
@@ -1,208 +1,92 @@
-
-
-
-
-
-
-
-threestudio is a unified framework for 3D content creation from text prompts, single images, and few-shot images, by lifting 2D text-to-image generation models.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
+ This is the official implementation of the paper
+
+Score Distillation via Reparametrized DDIM
+
+
-
-👆 Results obtained from methods implemented by threestudio 👆
-| ProlificDreamer | DreamFusion | Magic3D | SJC | Latent-NeRF | Fantasia3D | TextMesh |
-
-| Zero-1-to-3 | Magic123 | HiFA |
-
-| InstructNeRF2NeRF | Control4D |
-
-
-
+
+
+
+
+
-
-
-
- Did not find what you want? Checkout threestudio-extension or submit a feature request here!
+
+
+
-
+ Artem Lukoianov 1,
+ Haitz Sáez de Ocáriz Borde 2,
+ Kristjan Greenewald 3,
+ Vitor Campagnolo Guizilini 4,
+ Timur Bagautdinov 5,
+ Vincent Sitzmann 1,
+ Justin Solomon 1
-
-
-
-
-
-
-
-
-
+ 1 Massachusetts Institute of Technology,
+ 2 University of Oxford,
+ 3 MIT-IBM Watson AI Lab, IBM Research,
+ 4 Toyota Research Institute,
+ 5 Meta Reality Labs Research
-
-| Animate-124 | 4D-fy | GeoDream | DreamCraft3D | Dreamwaltz | 3DFuse | Progressive3D | GaussianDreamer | Gaussian Splatting | MVDream | Mesh-Fitting |
-
-## News
-- 12/03/2024: Thank [Matthew Kwak](https://github.com/mskwak01) and [Inès Hyeonsu Kim](https://github.com/Ines-Hyeonsu-Kim) for implementation of [3DFuse](https://github.com/KU-CVLAB/3DFuse-threestudio)! Follow the instructions on its website to give it a try.
-- 08/03/2024: Thank [Xinhua Cheng](https://github.com/cxh0519/) for implementation of [GaussianDreamer](https://github.com/cxh0519/threestudio-gaussiandreamer)! Follow the instructions on its website to give it a try.
-- 01/03/2024: Thank [Xinhua Cheng](https://github.com/cxh0519/) for implementation of [Progressive3D](https://github.com/cxh0519/Progressive3D)! Follow the instructions on its website to give it a try.
-- 09/01/2024: Thank [Zehuan Huang](https://github.com/huanngzh) for implementation of 3D human avatar generation [Dreamwaltz](https://github.com/huanngzh/threestudio-dreamwaltz)! Follow the instructions on its website to give it a try.
-- 06/01/2024: Thank [Baorui Ma](https://github.com/mabaorui) for implementation of [GeoGream extensions](https://github.com/baaivision/GeoDream/tree/threestudio)! Follow the instructions on its website to give it a try.
-- 05/01/2024: Implemented HiFA. Follow the instructions [here](https://github.com/threestudio-project/threestudio#hifa-) to try all three variants.
-- 23/12/2023: Thank [Yuyang Zhao](https://github.com/HeliosZhao) for implementation of image-to-4D generation extensions [Animate-124](https://github.com/HeliosZhao/Animate124/tree/threestudio)! Follow the instructions on the extensions website to give it a try.
-- 18/12/2023: Implementation of [4D-fy](https://github.com/DSaurus/threestudio-4dfy) for 4D generation and [DreamCraft3D](https://github.com/DSaurus/threestudio-dreamcraft3D) for high-quality image-to-3D generation as the custom extensions! Follow the instructions on the extensions website to give it a try.
-- 13/12/2023: Implementation supporting [Stable Zero123](https://stability.ai/news/stable-zero123-3d-generation) for 3D generation from a single image! Follow the instructions [here](https://github.com/threestudio-project/threestudio#stable-zero123) to give it a try.
-- 30/11/2023: Implementation of [MVDream](https://github.com/DSaurus/threestudio-mvdream), [Gaussian Splatting](https://github.com/DSaurus/threestudio-3dgs) as the custom extensions. You can also use neural representation to fit a mesh by [Mesh-Fitting](https://github.com/DSaurus/threestudio-meshfitting).
-- 30/11/2023: Implementation of [custom extension system](https://threestudio-project.github.io/threestudio-extensions/) and you can add your extensions in [this project](https://github.com/threestudio-project/threestudio-extensions).
-- 25/06/2023: Implementation of [Magic123](https://guochengqian.github.io/project/magic123/)! Follow the instructions [here](https://github.com/threestudio-project/threestudio#magic123-) to give it a try.
-- 06/07/2023: Join our [Discord server](https://discord.gg/ejer2MAB8N) for lively discussions!
-- 03/07/2023: Try text-to-3D online in [HuggingFace Spaces](https://huggingface.co/spaces/bennyguo/threestudio) or using our [self-hosted service](http://t23-g-01.threestudio.ai) (GPU support from Tencent). To host the web interface locally, see [here](https://github.com/threestudio-project/threestudio#gradio-web-interface).
-- 20/06/2023: Implementations of Instruct-NeRF2NeRF and Control4D for high-fidelity 3D editing! Follow the instructions for [Control4D](https://github.com/threestudio-project/threestudio#control4d-) and [Instruct-NeRF2NeRF](https://github.com/threestudio-project/threestudio#instructnerf2nerf-) to give it a try.
-- 14/06/2023: Implementation of TextMesh! Follow the instructions [here](https://github.com/threestudio-project/threestudio#textmesh-) to give it a try.
-- 14/06/2023: Implementation of [prompt debiasing](https://arxiv.org/abs/2303.15413) and [Perp-Neg](https://perp-neg.github.io/)! Follow the instructions [here](https://github.com/threestudio-project/threestudio#tips-on-improving-quality) to give it a try.
-- 29/05/2023: An experimental implementation of using [Zero-1-to-3](https://zero123.cs.columbia.edu/) for 3D generation from a single image! Follow the instructions [here](https://github.com/threestudio-project/threestudio#zero-1-to-3-) to give it a try.
-- 26/05/2023: Implementation of [ProlificDreamer](https://ml.cs.tsinghua.edu.cn/prolificdreamer/)! Follow the instructions [here](https://github.com/threestudio-project/threestudio#prolificdreamer-) to give it a try.
-- 14/05/2023: You can experiment with the SDS loss on 2D images using our [2dplayground](2dplayground.ipynb).
-- 13/05/2023: You can now try threestudio on [Google Colab](https://colab.research.google.com/github/threestudio-project/threestudio/blob/main/threestudio.ipynb)!
-- 11/05/2023: We now support exporting textured meshes! See [here](https://github.com/threestudio-project/threestudio#export-meshes) for instructions.
-
-
-## Installation
-
-See [installation.md](docs/installation.md) for additional information, including installation via Docker.
-
-The following steps have been tested on Ubuntu20.04.
+
+ For any questions please shoot an email to arteml@mit.edu
+
-- You must have an NVIDIA graphics card with at least 6GB VRAM and have [CUDA](https://developer.nvidia.com/cuda-downloads) installed.
-- Install `Python >= 3.8`.
-- (Optional, Recommended) Create a virtual environment:
+## Prerequisites
+For this project, we recommend using a UNIX server with CUDA support and a GPU with at least 40GB of VRAM.
+If the amount of available VRAM is limited, we recommend reducing the rendering resolution by adding the following argument to the run command:
```sh
-python3 -m virtualenv venv
-. venv/bin/activate
-
-# Newer pip versions, e.g. pip-23.x, can be much faster than old versions, e.g. pip-20.x.
-# For instance, it caches the wheels of git packages to avoid unnecessarily rebuilding them later.
-python3 -m pip install --upgrade pip
+data.width=128 data.height=128
```
-- Install `PyTorch >= 1.12`. We have tested on `torch1.12.1+cu113` and `torch2.0.0+cu118`, but other versions should also work fine.
+Please note that this will reduce the quality of the generated shapes.
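+
+For example, the override can be appended directly to the generation command shown in the "Running generation" section below:
+```sh
+# reduced-resolution run (lower VRAM usage, lower quality)
+python launch.py --config configs/sdi.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a hamburger" data.width=128 data.height=128
+```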
-```sh
-# torch1.12.1+cu113
-pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
-# or torch2.0.0+cu118
-pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
-```
+## Installation
-- (Optional, Recommended) Install ninja to speed up the compilation of CUDA extensions:
+This project is based on [Threestudio](https://github.com/threestudio-project/threestudio).
+Below is an example of the installation procedure used by the authors on Ubuntu 22.04 with CUDA 12.3:
```sh
-pip install ninja
-```
+conda create -n threestudio-sdi python=3.9
+conda activate threestudio-sdi
-- Install dependencies:
+# Consult https://pytorch.org/get-started/locally/ for the latest PyTorch installation instructions
+conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
-```sh
+pip install ninja
pip install -r requirements.txt
```
-- (Optional) `tiny-cuda-nn` installation might require downgrading pip to 23.0.1
-
-- (Optional, Recommended) The best-performing models in threestudio use the newly-released T2I model [DeepFloyd IF](https://github.com/deep-floyd/IF), which currently requires signing a license agreement. If you would like to use these models, you need to [accept the license on the model card of DeepFloyd IF](https://huggingface.co/DeepFloyd/IF-I-XL-v1.0), and login into the Hugging Face hub in the terminal by `huggingface-cli login`.
-
-- For contributors, see [here](https://github.com/threestudio-project/threestudio#contributing-to-threestudio).
+For additional installation options, please refer to the official Threestudio installation instructions [here](https://github.com/threestudio-project/threestudio?tab=readme-ov-file#installation).
-## Quickstart
-
-Here we show some basic usage of threestudio. First let's train a DreamFusion model to create a classic pancake bunny.
-
-**If you are experiencing unstable connections with Hugging Face, we suggest you either (1) setting environment variable `TRANSFORMERS_OFFLINE=1 DIFFUSERS_OFFLINE=1 HF_HUB_OFFLINE=1` before your running command after all needed files have been fetched on the first run, to prevent from connecting to Hugging Face each time you run, or (2) downloading the guidance model you used to a local folder following [here](https://huggingface.co/docs/huggingface_hub/v0.14.1/guides/download#download-an-entire-repository) and [here](https://huggingface.co/docs/huggingface_hub/v0.14.1/guides/download#download-files-to-local-folder), and set `pretrained_model_name_or_path` of the guidance and the prompt processor to the local path.**
+## Running generation
+The process of generating a shape is similar to the one described in the [threestudio](https://github.com/threestudio-project/threestudio?tab=readme-ov-file#quickstart) documentation.
+Make sure you are using the SDI config file, as shown below:
```sh
-# if you have agreed the license of DeepFloyd IF and have >20GB VRAM
-# please try this configuration for higher quality
-python launch.py --config configs/dreamfusion-if.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes"
-# otherwise you could try with the Stable Diffusion model, which fits in 6GB VRAM
-python launch.py --config configs/dreamfusion-sd.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes"
+python launch.py --config configs/sdi.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a hamburger"
```
-threestudio uses [OmegaConf](https://github.com/omry/omegaconf) for flexible configurations. You can easily change any configuration in the YAML file by specifying arguments without `--`, for example the specified prompt in the above cases. For all supported configurations, please see our [documentation](https://github.com/threestudio-project/threestudio/blob/main/DOCUMENTATION.md).
-
-The training lasts for 10,000 iterations. You can find visualizations of the current status in the trial directory which defaults to `[exp_root_dir]/[name]/[tag]@[timestamp]`, where `exp_root_dir` (`outputs/` by default), `name` and `tag` can be set in the configuration file. A 360-degree video will be generated after the training is completed. In training, press `ctrl+c` one time will stop training and head directly to the test stage which generates the video. Press `ctrl+c` the second time to fully quit the program.
-
-### Multi-GPU training
-
-Multi-GPU training is supported, but may still be [buggy](https://github.com/threestudio-project/threestudio/issues/195). Note that `data.batch_size` is the batch size **per rank (device)**. Also remember to
-
-- Set `data.n_val_views` to be a multiple of the number of GPUs.
-- Set a unique `tag` as timestamp is disabled in multi-GPU training and will not be appended after the tag. If you the same tag as previous trials, saved config files, code and visualizations will be overridden.
-
-```sh
-# this results in an effective batch size of 4 (number of GPUs) * 2 (data.batch_size) = 8
-python launch.py --config configs/dreamfusion-if.yaml --train --gpu 0,1,2,3 system.prompt_processor.prompt="a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes" data.batch_size=2 data.n_val_views=4
-```
-
-If you define the `CUDA_VISIBLE_DEVICES` environment variable before you call `launch.py`, you don't need to specify `--gpu` - this will use all available GPUs from `CUDA_VISIBLE_DEVICES`. For instance, the following command will automatically use GPUs 3 and 4:
-
-`CUDA_VISIBLE_DEVICES=3,4 python launch.py --config configs/dreamfusion-if.yaml --train system.prompt_processor.prompt="a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes"`
-
-This is particularly useful if you run `launch.py` in a cluster using a command that automatically picks GPU(s) and exports their IDs through CUDA_VISIBLE_DEVICES, e.g. through SLURM:
-
-```bash
-cd git/threestudio
-. venv/bin/activate
-srun --account mod3d --partition=g40 --gpus=1 --job-name=3s_bunny python launch.py --config configs/dreamfusion-if.yaml --train system.prompt_processor.prompt="a zoomed out DSLR photo of a baby bunny sitting on top of a stack of pancakes"
-```
-
-### Resume from checkpoints
-
-If you want to resume from a checkpoint, do:
-
-```sh
-# resume training from the last checkpoint, you may replace last.ckpt with any other checkpoints
-python launch.py --config path/to/trial/dir/configs/parsed.yaml --train --gpu 0 resume=path/to/trial/dir/ckpts/last.ckpt
-# if the training has completed, you can still continue training for a longer time by setting trainer.max_steps
-python launch.py --config path/to/trial/dir/configs/parsed.yaml --train --gpu 0 resume=path/to/trial/dir/ckpts/last.ckpt trainer.max_steps=20000
-# you can also perform testing using resumed checkpoints
-python launch.py --config path/to/trial/dir/configs/parsed.yaml --test --gpu 0 resume=path/to/trial/dir/ckpts/last.ckpt
-# note that the above commands use parsed configuration files from previous trials
-# which will continue using the same trial directory
-# if you want to save to a new trial directory, replace parsed.yaml with raw.yaml in the command
-
-# only load weights from saved checkpoint but dont resume training (i.e. dont load optimizer state):
-python launch.py --config path/to/trial/dir/configs/parsed.yaml --train --gpu 0 system.weights=path/to/trial/dir/ckpts/last.ckpt
-```
+The results will be saved to `outputs/score-distillation-via-inversion/`.
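+
+As in threestudio, a finished run can be re-rendered from its saved checkpoint with the `--test` flag; `TRIAL_DIR` below is a placeholder for the actual trial directory name:
+```sh
+# re-render the 360-degree video from the last checkpoint of a previous trial
+python launch.py --config outputs/score-distillation-via-inversion/TRIAL_DIR/configs/parsed.yaml --test --gpu 0 resume=outputs/score-distillation-via-inversion/TRIAL_DIR/ckpts/last.ckpt
+```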
### Export Meshes
-To export the scene to texture meshes, use the `--export` option. We currently support exporting to obj+mtl, or obj with vertex colors.
+To export the scene to textured meshes, use the `--export` option. Threestudio currently supports exporting to obj+mtl, or obj with vertex colors:
```sh
# this uses default mesh-exporter configurations which exports obj+mtl
@@ -222,546 +106,27 @@ For all the options you can specify when exporting, see [the documentation](http
See [here](https://github.com/threestudio-project/threestudio#supported-models) for example running commands of all our supported models. Please refer to [here](https://github.com/threestudio-project/threestudio#tips-on-improving-quality) for tips on getting higher-quality results, and [here](https://github.com/threestudio-project/threestudio#vram-optimization) for reducing VRAM usage.
-### Gradio Web Interface
-
-Launch the Gradio web interface by
-
-```
-python gradio_app.py launch
-```
-
-Parameters:
-
-- `--listen`: listens to all addresses by setting `server_name="0.0.0.0"` when launching the Gradio app.
-- `--self-deploy`: enables changing arbitrary configurations directly from the web.
-- `--save`: enables checkpoint saving.
-
-For feature requests, bug reports, or discussions about technical problems, please [file an issue](https://github.com/threestudio-project/threestudio/issues/new). In case you want to discuss the generation quality or showcase your generation results, please feel free to participate in the [discussion panel](https://github.com/threestudio-project/threestudio/discussions).
-
-## Supported Models
-
-### ProlificDreamer [](https://arxiv.org/abs/2305.16213)
-
-**This is an unofficial experimental implementation! Please refer to [https://github.com/thu-ml/prolificdreamer](https://github.com/thu-ml/prolificdreamer) for official code release.**
-
-**Results obtained by threestudio (Stable Diffusion, 256x256 Stage1)**
-
-https://github.com/threestudio-project/threestudio/assets/19284678/27b42d8f-4aa4-4b47-8ea0-0f77db90fd1e
-
-https://github.com/threestudio-project/threestudio/assets/19284678/ffcbbb01-3817-4663-a2bf-5e21a076bc3d
-
-**Results obtained by threestudio (Stable Diffusion, 256x256 Stage1, 512x512 Stage2+3)**
-
-https://github.com/threestudio-project/threestudio/assets/19284678/cfab881e-18dc-45fc-8384-7476f835b36e
-
-Notable differences from the paper:
-
-- ProlificDreamer adopts a two-stage sampling strategy with 64 coarse samples and 32 fine samples, while we only use 512 coarse samples.
-- In the first stage, we only render 64x64 images at the first 5000 iterations. After that, as the empty space has been effectively pruned, rendering 512x512 images wouldn't cost too much VRAM.
-- We currently don't support multiple particles.
-
-```sh
-# --------- Stage 1 (NeRF) --------- #
-# object generation with 512x512 NeRF rendering, ~30GB VRAM
-python launch.py --config configs/prolificdreamer.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple"
-# if you don't have enough VRAM, try training with 64x64 NeRF rendering, ~15GB VRAM
-python launch.py --config configs/prolificdreamer.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple" data.width=64 data.height=64 data.batch_size=1
-# using the same model for pretrained and LoRA enables 64x64 training with <10GB VRAM
-# but the quality is worse due to the use of an epsilon prediction model for LoRA training
-python launch.py --config configs/prolificdreamer.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple" data.width=64 data.height=64 data.batch_size=1 system.guidance.pretrained_model_name_or_path_lora="stabilityai/stable-diffusion-2-1-base"
-# Using patch-based renderer to reduce memory consume, 512x512 resolution, ~20GB VRAM
-python launch.py --config configs/prolificdreamer-patch.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple"
-# scene generation with 512x512 NeRF rendering, ~30GB VRAM
-python launch.py --config configs/prolificdreamer-scene.yaml --train --gpu 0 system.prompt_processor.prompt="Inside of a smart home, realistic detailed photo, 4k"
-
-# --------- Stage 2 (Geometry Refinement) --------- #
-# refine geometry with 512x512 rasterization, Stable Diffusion SDS guidance
-python launch.py --config configs/prolificdreamer-geometry.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple" system.geometry_convert_from=path/to/stage1/trial/dir/ckpts/last.ckpt
-
-# --------- Stage 3 (Texturing) --------- #
-# texturing with 512x512 rasterization, Stable Difusion VSD guidance
-python launch.py --config configs/prolificdreamer-texture.yaml --train --gpu 0 system.prompt_processor.prompt="a pineapple" system.geometry_convert_from=path/to/stage2/trial/dir/ckpts/last.ckpt
+### Ablations
+There are 5 main parameters in `system.guidance` used to reproduce the ablation results:
+```yaml
+enable_sdi: true # if true, the noise is obtained by running the DDIM inversion process; if false, noise is sampled randomly as in SDS
+inversion_guidance_scale: -7.5 # guidance scale for the DDIM inversion process
+inversion_n_steps: 10 # number of steps in the inversion process
+inversion_eta: 0.3 # amount of random noise added at the end of the inversion process
+t_anneal: true # if true, the timestep t is annealed from 0.98 to 0.2 instead of being sampled from U[0.2, 0.98] as in SDS
```
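+
+These parameters can be overridden from the command line like any other threestudio option; for instance, a sketch of an SDS-style ablation (switching back to randomly sampled noise and uniform timesteps) would be:
+```sh
+# ablation example: disable DDIM inversion and timestep annealing
+python launch.py --config configs/sdi.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a hamburger" system.guidance.enable_sdi=false system.guidance.t_anneal=false
+```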
-### HiFA [](https://arxiv.org/abs/2305.18766)
-**This is a re-implementation, missing some improvements from the original paper(coarse-to-fine NeRF sampling, kernel smoothing). For original results, please refer to [https://github.com/JunzheJosephZhu/HiFA](https://github.com/JunzheJosephZhu/HiFA)**
-
-HiFA is more like a suite of improvements including image space SDS, z-variance loss, and noise strength annealing. It is compatible with most optimization-based methods. Therefore, we provide three variants based on DreamFusion, ProlificDreamer, and Magic123. We provide a unified guidance config as well as an SDS/VSD guidance config for the DreamFusion and ProlificDreamer variants, both configs should achieve the same results. Additionally, we also make HiFA compatible with ProlificDreamer-scene.
-
-**Results obtained by threestudio(Dreamfusion-HiFA, 512x512)**
-
-https://github.com/threestudio-project/threestudio/assets/24391451/c0030c66-0691-4ec2-8b79-d933101864a0
-
-**Results obtained by threestudio(ProlificDreamer-HiFA, 512x512)**
-
-https://github.com/threestudio-project/threestudio/assets/24391451/ff5dc4d0-d7d7-4a73-964e-84b8c48e2907
-
-**Results obtained by threestudio(Magic123-HiFA, 512x512)**
-
-https://github.com/threestudio-project/threestudio/assets/24391451/eb6f2f74-9143-4e26-8429-e300ad2d2b80
-
-**Example running commands**
-
-```sh
-# ------ DreamFusion-HiFA ------- # (similar to original paper)
-python launch.py --config configs/hifa.yaml --train --gpu 0 system.prompt_processor.prompt="a plate of delicious tacos"
-python launch.py --config configs/experimental/unified-guidance/hifa.yaml --train --gpu 0 system.prompt_processor.prompt="a plate of delicious tacos"
-# ------ ProlificDreamer-HiFA ------- #
-python launch.py --config configs/prolificdreamer-hifa.yaml --train --gpu 0 system.prompt_processor.prompt="a plate of delicious tacos"
-python launch.py --config configs/experimental/unified-guidance/prolificdreamer-hifa.yaml --train --gpu 0 system.prompt_processor.prompt="a plate of delicious tacos"
-# ------ ProlificDreamer-scene-HiFA ------- #
-python launch.py --config configs/prolificdreamer-scene-hifa.yaml --train --gpu 0 system.prompt_processor.prompt="A DSLR photo of a hamburger inside a restaurant"
-# ------ Magic123-HiFA ------ #
-python launch.py --config configs/magic123-hifa-coarse-sd.yaml --train --gpu 0 data.image_path=load/images/firekeeper_rgba.png system.prompt_processor.prompt="a toy figure of firekeeper from dark souls"
-# We included a config for magic123's refine stage, but didn't really run it, since the coarse stage result already looks pretty decent.
-```
-
-**Tips**
-
-- If the generated object's color seems oversaturated, decrease lambda_sds_img(or lambda_sd_img if using unified guidance).
-- If the generated object looks cloudy, increase lamda_z_variance. If the shape becomes corrupted, decrease lambda_z_variance.
-- If the generated object overall seems to have high luminance, increase min_step_percent.
-- Make sure sqrt_anneal and use_img_loss are both set to True.
-- Check out the [original repo](https://github.com/JunzheJosephZhu/HiFA)! The results are better.
-- **If you are using sqrt_anneal, make sure system.guidance.trainer_max_steps is equal to trainer.max_steps, so noise strength annealing works correctly**
-
-### DreamFusion [](https://arxiv.org/abs/2209.14988)
-
-**Results obtained by threestudio (DeepFloyd IF, batch size 8)**
-
-https://user-images.githubusercontent.com/19284678/236694848-38ae4ea4-554b-4c9d-b4c7-fba5bee3acb3.mp4
-
-**Notable differences from the paper**
-
-- We use open-source T2I models (StableDiffusion, DeepFloyd IF), while the paper uses Imagen.
-- We use a guidance scale of 20 for DeepFloyd IF, while the paper uses 100 for Imagen.
-- We do not use sigmoid to normalize the albedo color but simply scale the color from `[-1,1]` to `[0,1]`, as we find this help convergence.
-- We use HashGrid encoding and uniformly sample points along rays, while the paper uses Integrated Positional Encoding and sampling strategy from MipNeRF360.
-- We adopt camera settings and density initialization strategy from Magic3D, which is slightly different from the DreamFusion paper.
-- Some hyperparameters are different, such as the weighting of loss terms.
-
-**Example running commands**
-
-```sh
-# uses DeepFloyd IF, requires ~15GB VRAM to extract text embeddings and ~10GB VRAM in training
-# here we adopt random background augmentation to improve geometry quality
-python launch.py --config configs/dreamfusion-if.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger" system.background.random_aug=true
-# uses StableDiffusion, requires ~6GB VRAM in training
-python launch.py --config configs/dreamfusion-sd.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger"
-```
-
-**Tips**
-
-- DeepFloyd IF performs **way better than** StableDiffusion.
-- Validation shows albedo color before `system.material.ambient_only_steps` and shaded color after that.
-- Try increasing/decreasing `system.loss.lambda_sparsity` if your scene is stuffed with floaters/becoming empty.
-- Try increasing/decreasing `system.loss.lambda_orient` if you object is foggy/over-smoothed.
-- Try replacing the background to random colors with a probability 0.5 by setting `system.background.random_aug=true` if you find the model incorrectly treats the background as part of the object.
-- DeepFloyd IF uses T5-XXL as its text encoder, which consumes ~15GB VRAM even when using 8-bit quantization. This is currently the bottleneck for training with less VRAM. If anyone knows how to run the text encoder with less VRAM, please file an issue. We're also trying to push the text encoder to [Replicate](https://replicate.com/) to enable extracting text embeddings via API, but are having some network connection issues. Please [contact bennyguo](mailto:imbennyguo@gmail.com) if you would like to help out.
-
-### Magic3D [](https://arxiv.org/abs/2211.10440)
-
-**Results obtained by threestudio (DeepFloyd IF, batch size 8; first row: coarse, second row: refine)**
-
-https://user-images.githubusercontent.com/19284678/236694858-0ed6939e-cd7a-408f-a94b-406709ae90c0.mp4
-
-**Notable differences from the paper**
-
-- We use open-source T2I models (StableDiffusion, DeepFloyd IF) for the coarse stage, while the paper uses eDiff-I.
-- In the coarse stage, we use a guidance scale of 20 for DeepFloyd IF, while the paper uses 100 for eDiff-I.
-- In the coarse stage, we use analytic normal, while the paper uses predicted normal.
-- In the coarse stage, we use orientation loss as in DreamFusion, while the paper does not.
-- There are many things that are omitted from the paper such as the weighting of loss terms and the DMTet grid resolution, which could be different.
-
-**Example running commands**
-
-First train the coarse stage NeRF:
-
-```sh
-# uses DeepFloyd IF, requires ~15GB VRAM to extract text embeddings and ~10GB VRAM in training
-python launch.py --config configs/magic3d-coarse-if.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger"
-# uses StableDiffusion, requires ~6GB VRAM in training
-python launch.py --config configs/magic3d-coarse-sd.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger"
-```
-
-Then convert the NeRF from the coarse stage to DMTet and train with differentiable rasterization:
-
-```sh
-# the refinement stage uses StableDiffusion, and requires ~5GB VRAM in training
-python launch.py --config configs/magic3d-refine-sd.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger" system.geometry_convert_from=path/to/coarse/stage/trial/dir/ckpts/last.ckpt
-# if you're unsatisfied with the surface extracted using the default threshold (25)
-# you can specify a threshold value using `system.geometry_convert_override`
-# decrease the value if the extracted surface is incomplete, increase if it is extruded
-python launch.py --config configs/magic3d-refine-sd.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger" system.geometry_convert_from=path/to/coarse/stage/trial/dir/ckpts/last.ckpt system.geometry_convert_override.isosurface_threshold=10.
-```
-
-**Tips**
-
-- For the coarse stage, DeepFloyd IF performs **way better than** StableDiffusion.
-- Magic3D uses a neural network to predict the surface normal, which may not resemble the true geometric normal and degrade geometry quality, so we use analytic normal instead.
-- Try increasing/decreasing `system.loss.lambda_sparsity` if your scene is stuffed with floaters/becoming empty.
-- Try increasing/decreasing `system.loss.lambda_orient` if you object is foggy/over-smoothed.
-- Try replacing the background with random colors with a probability 0.5 by setting `system.background.random_aug=true` if you find the model incorrectly treats the background as part of the object.
-
-### Score Jacobian Chaining [](https://arxiv.org/abs/2212.00774)
-
-**Results obtained by threestudio (Stable Diffusion)**
-
-https://user-images.githubusercontent.com/19284678/236694871-87a247c1-2d3d-4cbf-89df-450bfeac3aca.mp4
-
-Notable differences from the paper: N/A.
-
-**Example running commands**
-
-```sh
-# train with sjc guidance in latent space
-python launch.py --config configs/sjc.yaml --train --gpu 0 system.prompt_processor.prompt="A high quality photo of a delicious burger"
-# train with sjc guidance in latent space, trump figure
-python launch.py --config configs/sjc.yaml --train --gpu 0 system.prompt_processor.prompt="Trump figure" trainer.max_steps=30000 system.loss.lambda_emptiness="[15000,10000.0,200000.0,15001]" system.optimizer.params.background.lr=0.05 seed=42
-```
-
-**Tips**
-
-- SJC uses subpixel rendering which decodes a `128x128` latent feature map for better visualization quality. You can turn off this feature by `system.subpixel_rendering=false` to save VRAM in validation/testing.
-
-### Latent-NeRF [](https://arxiv.org/abs/2211.07600)
-
-**Results obtained by threestudio (Stable Diffusion)**
-
-https://user-images.githubusercontent.com/19284678/236694876-5a270347-6a41-4429-8909-44c90c554e06.mp4
-
-Notable differences from the paper: N/A.
-
-We currently only implement Latent-NeRF for text-guided and Sketch-Shape for (text,shape)-guided 3D generation. Latent-Paint is not implemented yet.
-
-**Example running commands**
-
-```sh
-# train Latent-NeRF in Stable Diffusion latent space
-python launch.py --config configs/latentnerf.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger"
-# refine Latent-NeRF in RGB space
-python launch.py --config configs/latentnerf-refine.yaml --train --gpu 0 system.prompt_processor.prompt="a delicious hamburger" system.weights=path/to/latent/stage/trial/dir/ckpts/last.ckpt
-
-# train Sketch-Shape in Stable Diffusion latent space
-python launch.py --config configs/sketchshape.yaml --train --gpu 0 system.guide_shape=load/shapes/teddy.obj system.prompt_processor.prompt="a teddy bear in a tuxedo"
-# refine Sketch-Shape in RGB space
-python launch.py --config configs/sketchshape-refine.yaml --train --gpu 0 system.guide_shape=load/shapes/teddy.obj system.prompt_processor.prompt="a teddy bear in a tuxedo" system.weights=path/to/latent/stage/trial/dir/ckpts/last.ckpt
-```
-
-### Fantasia3D [](https://arxiv.org/abs/2303.13873)
-
-**Results obtained by threestudio (Stable Diffusion)**
-
-https://user-images.githubusercontent.com/19284678/236694880-33b0db21-4530-47f1-9c3b-c70357bc84b3.mp4
-
-**Results obtained by threestudio (Stable Diffusion, mesh initialization)**
-
-https://github.com/threestudio-project/threestudio/assets/19284678/762903c1-665b-47b5-a2c2-bd7021a9e548.mp4
-
-
-
-
-
-Notable differences from the paper:
-
-- We enable tangent-space normal perturbation by default, which can be turned off by appending `system.material.use_bump=false`.
-
-**Example running commands**
-
-```sh
-# --------- Geometry --------- #
-python launch.py --config configs/fantasia3d.yaml --train --gpu 0 system.prompt_processor.prompt="a DSLR photo of an ice cream sundae"
-# Fantasia3D highly relies on the initialized SDF shape
-# the default shape is a sphere with radius 0.5
-# change the shape initialization to match your input prompt
-python launch.py --config configs/fantasia3d.yaml --train --gpu 0 system.prompt_processor.prompt="The leaning tower of Pisa" system.geometry.shape_init=ellipsoid system.geometry.shape_init_params="[0.3,0.3,0.8]"
-# or you can initialize from a mesh
-# here shape_init_params is the scale of the shape
-# also make sure to input the correct up and front axis (in +x, +y, +z, -x, -y, -z)
-python launch.py --config configs/fantasia3d.yaml --train --gpu 0 system.prompt_processor.prompt="hulk" system.geometry.shape_init=mesh:load/shapes/human.obj system.geometry.shape_init_params=0.9 system.geometry.shape_init_mesh_up=+y system.geometry.shape_init_mesh_front=+z
-# --------- Texture --------- #
-# to train PBR texture continued from a geometry checkpoint:
-python launch.py --config configs/fantasia3d-texture.yaml --train --gpu 0 system.prompt_processor.prompt="a DSLR photo of an ice cream sundae" system.geometry_convert_from=path/to/geometry/stage/trial/dir/ckpts/last.ckpt
-```
-
-**Tips**
-
-- If you find the shape easily diverge in early training stages, you may use a lower guidance scale by setting `system.guidance.guidance_scale=30.`.
-
-### TextMesh [](https://arxiv.org/abs/2304.12439)
-
-**Results obtained by threestudio (DeepFloyd IF, batch size 4)**
-
-https://github.com/threestudio-project/threestudio/assets/19284678/72217cdd-765a-475b-92d0-4ab62bf0f57a
-
-**Notable differences from the paper**
-
-- Most of the settings are the same as the DreamFusion model. Please refer to the notable differences of the DreamFusion model.
-- We use NeuS as the geometry representation while the original paper uses VolSDF.
-- We adopt techniques from [Neuralangelo](https://arxiv.org/abs/2306.03092) to stabilize normal computation when using hash grids.
-- We currently only implemented the coarse stage of TextMesh.
-
-**Example running commands**
-
-```sh
-# uses DeepFloyd IF, requires ~15GB VRAM
-python launch.py --config configs/textmesh-if.yaml --train --gpu 0 system.prompt_processor.prompt="lib:cowboy_boots"
-```
-
-**Tips**
-
-- TextMesh uses a surface-based geometry representation, so you don't need to manually tune the isosurface threshold when exporting meshes!
-
-### Control4D [](https://arxiv.org/abs/2305.20082)
-
-**This is an experimental implementation of Control4D using threestudio! Control4D will release the full code including static and dynamic editing after paper acceptance.**
-
-**Results obtained by threestudio (512x512)**
-
-https://github.com/threestudio-project/threestudio/assets/24589363/97d9aadd-32c7-488f-9543-6951b285d588
-
-We currently don't support dynamic editing.
-
-Download the data sample of control4D using this [link](https://mailstsinghuaeducn-my.sharepoint.com/:u:/g/personal/shaorz20_mails_tsinghua_edu_cn/EcqOaEuNwH1KpR0JTzL4Ur0BO_iJr8RiY2rNAGVC7h3fng?e=Dyr2gu).
-
-**Example running commands**
-
-```sh
-# --------- Control4D --------- #
-# static editing with 128x128 NeRF + 512x512 GAN rendering, ~20GB VRAM
-python launch.py --config configs/control4d-static.yaml --train --gpu 0 data.dataroot="YOUR_DATAROOT/twindom" system.prompt_processor.prompt="Elon Musk wearing red shirt, RAW photo, (high detailed skin:1.2), 8k uhd, dslr, soft lighting, high quality, film grain, Fujifilm XT3"
-```
-
-### InstructNeRF2NeRF [](https://arxiv.org/abs/2303.12789)
-
-**Results obtained by threestudio**
-
-https://github.com/threestudio-project/threestudio/assets/24589363/7aa43a2d-87d7-4ef5-94b6-f778ddb041b5
-
-Download the data sample of InstructNeRF2NeRF using this [link](https://mailstsinghuaeducn-my.sharepoint.com/:u:/g/personal/shaorz20_mails_tsinghua_edu_cn/EbNazeNAYsBIvxGeXuCmOXgBiLv8KM-hfRNbNS7DtTvSvA?e=C1k4bM).
-
-**Example running commands**
-
-```sh
-# --------- InstructNeRF2NeRF --------- #
-# 3D editing with NeRF patch-based rendering, ~20GB VRAM
-python launch.py --config configs/instructnerf2nerf.yaml --train --gpu 0 data.dataroot="YOUR_DATAROOT/face" data.camera_layout="front" data.camera_distance=1 data.eval_interpolation=[1,3,50] system.prompt_processor.prompt="Turn him into Albert Einstein"
-```
-
-### Magic123 [](https://arxiv.org/abs/2306.17843)
-
-**Results obtained by threestudio (Zero123 + Stable Diffusion)**
-
-https://github.com/threestudio-project/threestudio/assets/19284678/335a58a8-8fee-485b-ac27-c55a16f4a673
-
-**Notable differences from the paper**
-- This is an unofficial re-implementation which shares the same overall idea with the [official implementation](https://github.com/guochengqian/Magic123) but differs in some aspects like hyperparameters.
-- Textual Inversion is not supported, which means a text prompt is needed for training.
-
-**Example running commands**
-
-First train the coarse stage NeRF:
-
-```sh
-# Zero123 + Stable Diffusion, ~12GB VRAM
-# data.image_path must point to a 4-channel RGBA image
-# system.prompt_proessor.prompt must be specified
-python launch.py --config configs/magic123-coarse-sd.yaml --train --gpu 0 data.image_path=load/images/hamburger_rgba.png system.prompt_processor.prompt="a delicious hamburger"
-```
-
-Then convert the NeRF from the coarse stage to DMTet and train with differentiable rasterization:
-
-```sh
-# Zero123 + Stable Diffusion, ~10GB VRAM
-# data.image_path must point to a 4-channel RGBA image
-# system.prompt_proessor.prompt must be specified
-python launch.py --config configs/magic123-refine-sd.yaml --train --gpu 0 data.image_path=load/images/hamburger_rgba.png system.prompt_processor.prompt="a delicious hamburger" system.geometry_convert_from=path/to/coarse/stage/trial/dir/ckpts/last.ckpt
-# if you're unsatisfied with the surface extracted using the default threshold (25)
-# you can specify a threshold value using `system.geometry_convert_override`
-# decrease the value if the extracted surface is incomplete, increase if it is extruded
-python launch.py --config configs/magic123-refine-sd.yaml --train --gpu 0 data.image_path=load/images/hamburger_rgba.png system.prompt_processor.prompt="a delicious hamburger" system.geometry_convert_from=path/to/coarse/stage/trial/dir/ckpts/last.ckpt system.geometry_convert_override.isosurface_threshold=10.
-```
-
-**Tips**
-
-- If the image contains non-front-facing objects, specifying the approximate elevation and azimuth angle by setting `data.default_elevation_deg` and `data.default_azimuth_deg` can be helpful. In threestudio, top is elevation +90 and bottom is elevation -90; left is azimuth -90 and right is azimuth +90.
-
-
-### Stable Zero123
-
-**Installation**
-
-Download pretrained Stable Zero123 checkpoint `stable-zero123.ckpt` into `load/zero123` from https://huggingface.co/stabilityai/stable-zero123
-
-**Results obtained by threestudio (Stable Zero123 vs Zero123-XL)**
-
-
-**Direct multi-view images generation**
-If you only want to generate multi-view images, please refer to [threestudio-mvimg-gen](https://github.com/DSaurus/threestudio-mvimg-gen). This extension can use Stable Zero123 to directly generate images from multi-view perspectives.
-
-**Example running commands**
-
-1. Take an image of your choice, or generate it from text using your favourite AI image generator such as SDXL Turbo (https://clipdrop.co/stable-diffusion-turbo) E.g. "A simple 3D render of a friendly dog"
-2. Remove its background using Clipdrop (https://clipdrop.co/remove-background)
-3. Save to `load/images/`, preferably with `_rgba.png` as the suffix
-4. Run Zero-1-to-3 with the Stable Zero123 ckpt:
-```sh
-python launch.py --config configs/stable-zero123.yaml --train --gpu 0 data.image_path=./load/images/hamburger_rgba.png
-```
-
-**IMPORTANT NOTE: This is an experimental implementation and we're constantly improving the quality.**
-
-**IMPORTANT NOTE: This implementation extends the Zero-1-to-3 implementation below, and is heavily inspired from the Zero-1-to-3 implementation in [https://github.com/ashawkey/stable-dreamfusion](stable-dreamfusion)! `extern/ldm_zero123` is borrowed from `stable-dreamfusion/ldm`.**
-
-
-### Zero-1-to-3 [](https://arxiv.org/abs/2303.11328)
-
-**Installation**
-
-Download pretrained Zero123XL weights into `load/zero123`:
-
-```sh
-cd load/zero123
-wget https://zero123.cs.columbia.edu/assets/zero123-xl.ckpt
-```
-
-**Results obtained by threestudio (Zero-1-to-3)**
-
-
-https://github.com/threestudio-project/threestudio/assets/22424247/f4e7b66f-7a46-4f9f-8fcd-750300cef651
-
-
-**IMPORTANT NOTE: This is an experimental implementation and we're constantly improving the quality.**
-
-**IMPORTANT NOTE: This implementation is heavily inspired from the Zero-1-to-3 implementation in [https://github.com/ashawkey/stable-dreamfusion](stable-dreamfusion)! `extern/ldm_zero123` is borrowed from `stable-dreamfusion/ldm`.**
-
-**Example running commands**
-
-1. Take an image of your choice, or generate it from text using your favourite AI image generator such as Stable Diffusion XL (https://clipdrop.co/stable-diffusion) E.g. "A simple 3D render of a friendly dog"
-2. Remove its background using Clipdrop (https://clipdrop.co/remove-background)
-3. Save to `load/images/`, preferably with `_rgba.png` as the suffix
-4. Run Zero-1-to-3:
-```sh
-python launch.py --config configs/zero123.yaml --train --gpu 0 data.image_path=./load/images/dog1_rgba.png
-```
-
-For more scripts for Zero-1-to-3, please check `threestudio/scripts/run_zero123.sh`.
-
-Previous Zero-1-to-3 weights are available at `https://huggingface.co/cvlab/zero123-weights/`. You can download them to `load/zero123` as above, and replace the path at `system.guidance.pretrained_model_name_or_path`.
-
-**Guidance evaluation**
-
-Also includes evaluation of the guidance during training. If `system.freq.guidance_eval` is set to a value > 0, this will save rendered image, noisy image (noise added mentioned at top left), 1-step-denoised image, 1-step prediction of original image, fully denoised image. For example:
-
-
-
-### More to come, please stay tuned.
-
-- [ ] [Dream3D](https://bluestyle97.github.io/dream3d/) [](https://arxiv.org/abs/2212.14704)
-- [ ] [DreamAvatar](https://yukangcao.github.io/DreamAvatar/) [](https://arxiv.org/abs/2304.00916)
-
-**If you would like to contribute a new method to threestudio, see [here](https://github.com/threestudio-project/threestudio#contributing-to-threestudio).**
-
-## Prompt Library
-
-For easier comparison, we collect the 397 preset prompts from the website of [DreamFusion](https://dreamfusion3d.github.io/gallery.html) in [this file](https://github.com/threestudio-project/threestudio/blob/main/load/prompt_library.json). You can use these prompts by setting `system.prompt_processor.prompt=lib:keyword1_keyword2_..._keywordN`. Note that the prompt should starts with `lib:` and all the keywords are separated by `_`. The prompt processor will match the keywords to all the prompts in the library, and will only succeed if there's **exactly one match**. The used prompt will be printed to the console. Also note that you can't use this syntax to point to every prompt in the library, as there are prompts that are subset of other prompts lmao. We will enhance the use of this feature.
-
-## Tips on Improving Quality
-
-It's important to note that existing techniques that lift 2D T2I models to 3D cannot consistently produce satisfying results. Results from great papers like DreamFusion and Magic3D are (to some extent) cherry-pickled, so don't be frustrated if you do not get what you expected on your first trial. Here are some tips that may help you improve the generation quality:
-
-- **Increase batch size**. Large batch sizes help convergence and improve the 3D consistency of the geometry. State-of-the-art methods claim using large batch sizes: DreamFusion uses a batch size of 4; Magic3D uses a batch size of 32; Fantasia3D uses a batch size of 24; some results shown above use a batch size of 8. You can easily change the batch size by setting `data.batch_size=N`. Increasing the batch size requires more VRAM. If you have limited VRAM but still want the benefit of large batch sizes, you may use [gradient accumulation provided by PyTorch Lightning](https://lightning.ai/docs/pytorch/stable/advanced/training_tricks.html#accumulate-gradients) by setting `trainer.accumulate_grad_batches=N`. This will accumulate the gradient of several batches and achieve a large effective batch size. Note that if you use gradient accumulation, you may need to multiply all step values by N times in your config, such as values that have the name `X_steps` and `trainer.val_check_interval`, since now N batches equal to a large batch.
-- **Train longer.** This helps if you can already obtain reasonable results and would like to enhance the details. If the result is still a mess after several thousand steps, training for a longer time often won't help. You can set the total training iterations by `trainer.max_steps=N`.
-- **Try different seeds.** This is a simple solution if your results have correct overall geometry but suffer from the multi-face Janus problem. You can change the seed by setting `seed=N`. Good luck!
-- **Tuning regularization weights.** Some methods have regularization terms which can be essential to obtaining good geometry. Try tuning the weights of these regularizations by setting `system.loss.lambda_X=value`. The specific values depend on your situation, you may refer to [tips for each supported model](https://github.com/threestudio-project/threestudio#supported-models) for more detailed instructions.
-- **Try debiasing methods.** When conventional SDS techniques like DreamFusion, Magic3D, SJC, and others fail to produce the desired 3D results, Debiased Score Distillation Sampling (D-SDS) can be a solution. D-SDS is devised to tackle challenges such as artifacts or the Janus problem, employing two strategies: score debiasing and prompt debiasing. You can activate score debiasing by just setting `system.guidance.grad_clip=[0,0.5,2.0,10000]`, where the order is `start_step, start_value, end_value, end_step`. You can enable prompt debiasing by setting `system.prompt_processor.use_prompt_debiasing=true`. When using prompt debiasing, it's recommended to set a list of indices for words that should potentially be removed by `system.prompt_processor.prompt_debiasing_mask_ids=[i1,i2,...]`. For example, if the prompt is `a smiling dog` and you only want to remove the word `smiling` for certain views, you should set it to `[1]`. You could also manually specify the prompt for each view by setting `system.prompt_processor.prompt_side`, `system.prompt_processor.prompt_back` and `system.prompt_processor.prompt_overhead`. For a detailed explanation of these techniques, refer to [the D-SDS paper](https://arxiv.org/abs/2303.15413) or check out [the project page](https://susunghong.github.io/Debiased-Score-Distillation-Sampling/).
-- **Try Perp-Neg.** The [Perp-Neg algorithm](https://perp-neg.github.io/) can potentially alleviate the multi-face Janus problem. We now support Perp-Neg for `stable-diffusion-guidance` and `deep-floyd-guidance` by setting `system.prompt_processor.use_perp_neg=true`.
-
-## VRAM Optimization
-
-If you encounter CUDA OOM error, try the following in order (roughly sorted by recommendation) to meet your VRAM requirement.
-
-- If you only encounter OOM at validation/test time, you can set `system.cleanup_after_validation_step=true` and `system.cleanup_after_test_step=true` to free memory after each validation/test step. This will slow down validation/testing.
-- Use a smaller batch size or use gradient accumulation as demonstrated [here](https://github.com/threestudio-project/threestudio#tips-on-improving-quality).
-- If you are using PyTorch1.x, enable [memory efficient attention](https://huggingface.co/docs/diffusers/optimization/fp16#memory-efficient-attention) by setting `system.guidance.enable_memory_efficient_attention=true`. PyTorch2.0 has built-in support for this optimization and is enabled by default.
-- Enable [attention slicing](https://huggingface.co/docs/diffusers/optimization/fp16#sliced-attention-for-additional-memory-savings) by setting `system.guidance.enable_attention_slicing=true`. This will slow down training by ~20%.
-- If you are using StableDiffusionGuidance, you can use [Token Merging](https://github.com/dbolya/tomesd) to **drastically** speed up computation and save memory. You can easily enable Token Merging by setting `system.guidance.token_merging=true`. You can also customize the Token Merging behavior by setting the parameters [here](https://github.com/dbolya/tomesd/blob/main/tomesd/patch.py#L183-L213) to `system.guidance.token_merging_params`. Note that Token Merging may degrade generation quality.
-- Enable [sequential CPU offload](https://huggingface.co/docs/diffusers/optimization/fp16#offloading-to-cpu-with-accelerate-for-memory-savings) by setting `system.guidance.enable_sequential_cpu_offload=true`. This could save a lot of VRAM but will make the training **extremely slow**.
-
-## Documentation
-
-threestudio use [OmegaConf](https://github.com/omry/omegaconf) to manage configurations. You can literally change anything inside the yaml configuration file or by adding command line arguments without `--`. We list all arguments that you can change in the configuration in our [documentation](https://github.com/threestudio-project/threestudio/blob/main/DOCUMENTATION.md). Happy experimenting!
-
-## wandb (Weights & Biases) logging
-
-To enable the (experimental) wandb support, set `system.loggers.wandb.enable=true`, e.g.:
-
-```bash
-python launch.py --config configs/zero123.yaml --train --gpu 0 system.loggers.wandb.enable=true`
-```
-
-If you're using a corporate wandb server, you may first need to login to your wandb instance, e.g.:
-`wandb login --host=https://COMPANY_XYZ.wandb.io --relogin`
-
-By default the runs will have a random name, recorded in the `threestudio` project. You can override them to give a more descriptive name, e.g.:
-
-`python launch.py --config configs/zero123.yaml --train --gpu 0 system.loggers.wandb.enable=true system.loggers.wandb.name="zero123xl_accum;bs=4;lr=0.05"`
-
-## Contributing to threestudio
-
-- Fork the repository and create your branch from `main`.
-- Install development dependencies:
-
-```sh
-pip install -r requirements-dev.txt
-```
-
-- If you are using VSCode as the text editor: (1) Install `editorconfig` extension. (2) Set the default linter to mypy to enable static type checking. (3) Set the default formatter to black. You could either manually format the document or let the editor format the document each time it is saved by setting `"editor.formatOnSave": true`.
-
-- Run `pre-commit install` to install pre-commit hooks which will automatically format the files before commit.
-
-- Make changes to the code, update README and DOCUMENTATION if needed, and open a pull request.
-
-### Code Structure
-
-Here we just briefly introduce the code structure of this project. We will make more detailed documentation about this in the future.
-
-- All methods are implemented as a subclass of `BaseSystem` (in `systems/base.py`). There typically are six modules inside a system: geometry, material, background, renderer, guidance, and prompt_processor. All modules are subclass of `BaseModule` (in `utils/base.py`) except for guidance, and prompt_processor, which are subclass of `BaseObject` to prevent them from being treated as model parameters and better control their behavior in multi-GPU settings.
-- All systems, modules, and data modules have their configurations in their own dataclasses.
-- Base configurations for the whole project can be found in `utils/config.py`. In the `ExperimentConfig` dataclass, `data`, `system`, and module configurations under `system` are parsed to configurations of each class mentioned above. These configurations are strictly typed, which means you can only use defined properties in the dataclass and stick to the defined type of each property. This configuration paradigm (1) naturally supports default values for properties; (2) effectively prevents wrong assignments of these properties (say typos in the yaml file) or inappropriate usage at runtime.
-- This projects use both static and runtime type checking. For more details, see `utils/typing.py`.
-- To update anything of a module at each training step, simply make it inherit to `Updateable` (see `utils/base.py`). At the beginning of each iteration, an `Updateable` will update itself, and update all its attributes that are also `Updateable`. Note that subclasses of `BaseSystem`, `BaseModule` and `BaseObject` are by default inherited to `Updateable`.
-
-## Known Problems
-
-- Gradients of Vanilla MLP parameters are empty in AMP (temporarily fixed by disabling autocast).
-- FullyFused MLP may cause NaNs in 32 precision.
-
-## Credits
-
-threestudio is built on the following amazing open-source projects:
-
-- **[Lightning](https://github.com/Lightning-AI/lightning)** Framework for creating highly organized PyTorch code.
-- **[OmegaConf](https://github.com/omry/omegaconf)** Flexible Python configuration system.
-- **[NerfAcc](https://github.com/KAIR-BAIR/nerfacc)** Plug-and-play NeRF acceleration.
-
-The following repositories greatly inspire threestudio:
-
-- **[Stable-DreamFusion](https://github.com/ashawkey/stable-dreamfusion)**
-- **[Latent-NeRF](https://github.com/eladrich/latent-nerf)**
-- **[Score Jacobian Chaining](https://github.com/pals-ttic/sjc)**
-- **[Fantasia3D.unofficial](https://github.com/ashawkey/fantasia3d.unofficial)**
-
-Thanks to the maintainers of these projects for their contribution to the community!
-## Citing threestudio
+## Citing
-If you find threestudio helpful, please consider citing:
+If you find our project useful, please consider citing it:
```
-@Misc{threestudio2023,
- author = {Yuan-Chen Guo and Ying-Tian Liu and Ruizhi Shao and Christian Laforte and Vikram Voleti and Guan Luo and Chia-Hao Chen and Zi-Xin Zou and Chen Wang and Yan-Pei Cao and Song-Hai Zhang},
- title = {threestudio: A unified framework for 3D content generation},
- howpublished = {\url{https://github.com/threestudio-project/threestudio}},
- year = {2023}
+@misc{lukoianov2024score,
+ title={Score Distillation via Reparametrized DDIM},
+ author={Artem Lukoianov and Haitz Sáez de Ocáriz Borde and Kristjan Greenewald and Vitor Campagnolo Guizilini and Timur Bagautdinov and Vincent Sitzmann and Justin Solomon},
+ year={2024},
+ eprint={2405.15891},
+ archivePrefix={arXiv},
+ primaryClass={cs.CV}
}
```
diff --git a/configs/sdi.yaml b/configs/sdi.yaml
new file mode 100644
index 00000000..ddd98683
--- /dev/null
+++ b/configs/sdi.yaml
@@ -0,0 +1,123 @@
+name: "score-distillation-via-inversion" # https://arxiv.org/abs/2405.15891
+tag: "${rmspace:${system.prompt_processor.prompt},_}"
+exp_root_dir: "outputs"
+seed: 0
+
+data_type: "random-camera-datamodule"
+data:
+ batch_size: 1
+ width: 512
+ height: 512
+ camera_distance_range: [1.5, 2.0]
+ fovy_range: [40, 70]
+ elevation_range: [-10, 45]
+ light_sample_strategy: "dreamfusion"
+ eval_camera_distance: 2.0
+ eval_fovy_deg: 70.
+
+system_type: "sdi-system"
+system:
+ geometry_type: "implicit-volume"
+ geometry:
+ radius: 2.0
+ normal_type: "analytic"
+
+ # use Magic3D density initialization
+ density_bias: "blob_magic3d"
+ density_activation: softplus
+ density_blob_scale: 10.
+ density_blob_std: 0.5
+
+ # coarse to fine hash grid encoding
+ # to ensure smooth analytic normals
+ pos_encoding_config:
+ otype: ProgressiveBandHashGrid
+ n_levels: 16
+ n_features_per_level: 2
+ log2_hashmap_size: 19
+ base_resolution: 16
+ per_level_scale: 1.447269237440378 # max resolution 4096
+ start_level: 8 # resolution ~200
+ start_step: 2000
+ update_steps: 500
+
+ material_type: "diffuse-with-point-light-material"
+ material:
+ ambient_only_steps: 10001
+ albedo_activation: sigmoid
+ diffuse_prob: 0.4 #0.75
+ textureless_prob: 0.5
+
+ background_type: "neural-environment-map-background"
+ background:
+ color_activation: sigmoid
+
+ renderer_type: "nerf-volume-renderer"
+ renderer:
+ radius: ${system.geometry.radius}
+ num_samples_per_ray: 512
+
+ prompt_processor_type: "stable-diffusion-prompt-processor"
+ prompt_processor:
+ pretrained_model_name_or_path: "stabilityai/stable-diffusion-2-1-base"
+ prompt: ???
+ use_perp_neg: true
+
+ guidance_type: "stable-diffusion-sdi-guidance"
+ guidance:
+ pretrained_model_name_or_path: "stabilityai/stable-diffusion-2-1-base"
+ guidance_scale: 7.5
+ weighting_strategy: sds
+ min_step_percent: 0.2
+ max_step_percent: 0.98
+
+ # SDI parameters
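+  # enable_sdi=false falls back to random noise sampling as in SDS;
+  # inversion_n_steps and inversion_eta control the number of DDIM inversion steps and
+  # the stochastic noise added at each inversion step, a negative inversion_guidance_scale
+  # flips the CFG direction during inversion, and t_anneal anneals the sampled timestep
+  # over the course of training.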
+ enable_sdi: true
+ inversion_guidance_scale: -7.5
+ inversion_n_steps: 10
+ inversion_eta: 0.3
+ t_anneal: true
+
+ loggers:
+ wandb:
+ enable: false
+ project: "threestudio"
+ name: None
+
+ loss:
+ # lambda_sds: 1.
+ # lambda_orient: [0, 10., 1000., 5000]
+ # lambda_sparsity: 1.
+ # lambda_opaque: 0.
+ lambda_sdi: 1.
+ lambda_orient: 0.1
+ lambda_sparsity: [0,0.15,0.,3000]
+ lambda_opaque: 0.1
+ # lambda_z_variance: 1.
+ # lambda_convex: [0,1.,0.1,4000]
+ # lambda_convex_hess: 0.
+
+ optimizer:
+ name: Adam
+ args:
+ lr: 0.01
+ betas: [0.9, 0.99]
+ eps: 1.e-15
+ params:
+ geometry:
+ lr: 0.01
+ background:
+ lr: 0.001
+
+trainer:
+ max_steps: 10000
+ log_every_n_steps: 1
+ num_sanity_val_steps: 0
+ val_check_interval: 50
+ enable_progress_bar: true
+ precision: 16-mixed
+
+checkpoint:
+ save_last: true # save at each validation time
+ save_top_k: -1
+ every_n_train_steps: ${trainer.max_steps}
diff --git a/threestudio/models/guidance/__init__.py b/threestudio/models/guidance/__init__.py
index b25a8d76..3fa96ae5 100644
--- a/threestudio/models/guidance/__init__.py
+++ b/threestudio/models/guidance/__init__.py
@@ -8,4 +8,5 @@
stable_zero123_guidance,
zero123_guidance,
zero123_unified_guidance,
+ stable_diffusion_sdi_guidance,
)
diff --git a/threestudio/models/guidance/stable_diffusion_sdi_guidance.py b/threestudio/models/guidance/stable_diffusion_sdi_guidance.py
new file mode 100644
index 00000000..9e55c3a7
--- /dev/null
+++ b/threestudio/models/guidance/stable_diffusion_sdi_guidance.py
@@ -0,0 +1,565 @@
+from dataclasses import dataclass, field
+
+import torch
+import torch.nn as nn
+import torch.nn.functional as F
+from diffusers import DDIMScheduler, DDIMInverseScheduler, StableDiffusionPipeline
+from diffusers.utils.import_utils import is_xformers_available
+from tqdm import tqdm
+
+import threestudio
+from threestudio.models.prompt_processors.base import PromptProcessorOutput
+from threestudio.utils.base import BaseObject
+from threestudio.utils.misc import C, cleanup, parse_version
+from threestudio.utils.ops import perpendicular_component
+from threestudio.utils.typing import *
+
+
+@threestudio.register("stable-diffusion-sdi-guidance")
+class StableDiffusionSDIGuidance(BaseObject):
+ @dataclass
+ class Config(BaseObject.Config):
+ pretrained_model_name_or_path: str = "runwayml/stable-diffusion-v1-5"
+ enable_memory_efficient_attention: bool = False
+ enable_sequential_cpu_offload: bool = False
+ enable_attention_slicing: bool = False
+ enable_channels_last_format: bool = False
+ guidance_scale: float = 100.0
+ grad_clip: Optional[
+ Any
+ ] = None # field(default_factory=lambda: [0, 2.0, 8.0, 1000])
+ half_precision_weights: bool = True
+
+ min_step_percent: float = 0.02
+ max_step_percent: float = 0.98
+ trainer_max_steps: int = 10000
+ use_img_loss: bool = False # image-space SDS proposed in HiFA: https://hifa-team.github.io/HiFA-site/
+
+ var_red: bool = True
+ weighting_strategy: str = "sds"
+
+ token_merging: bool = False
+ token_merging_params: Optional[dict] = field(default_factory=dict)
+
+ view_dependent_prompting: bool = True
+
+ """Maximum number of batch items to evaluate guidance for (for debugging) and to save on disk. -1 means save all items."""
+ max_items_eval: int = 4
+
+ n_ddim_steps: int = 50
+
+ # SDI parameters https://arxiv.org/abs/2405.15891
+ enable_sdi: bool = True # if false, sample noise randomly like in SDS
+ inversion_guidance_scale: float = -7.5
+ inversion_n_steps: int = 10
+ inversion_eta: float = 0.3
+ t_anneal: bool = True
+
+ cfg: Config
+
+ def configure(self) -> None:
+ threestudio.info(f"Loading Stable Diffusion ...")
+
+ self.weights_dtype = (
+ torch.float16 if self.cfg.half_precision_weights else torch.float32
+ )
+
+ pipe_kwargs = {
+ "tokenizer": None,
+ "safety_checker": None,
+ "feature_extractor": None,
+ "requires_safety_checker": False,
+ "torch_dtype": self.weights_dtype,
+ }
+ self.pipe = StableDiffusionPipeline.from_pretrained(
+ self.cfg.pretrained_model_name_or_path,
+ **pipe_kwargs,
+ ).to(self.device)
+
+ if self.cfg.enable_memory_efficient_attention:
+ if parse_version(torch.__version__) >= parse_version("2"):
+ threestudio.info(
+ "PyTorch2.0 uses memory efficient attention by default."
+ )
+ elif not is_xformers_available():
+ threestudio.warn(
+ "xformers is not available, memory efficient attention is not enabled."
+ )
+ else:
+ self.pipe.enable_xformers_memory_efficient_attention()
+
+ if self.cfg.enable_sequential_cpu_offload:
+ self.pipe.enable_sequential_cpu_offload()
+
+ if self.cfg.enable_attention_slicing:
+ self.pipe.enable_attention_slicing(1)
+
+ if self.cfg.enable_channels_last_format:
+ self.pipe.unet.to(memory_format=torch.channels_last)
+
+ del self.pipe.text_encoder
+ cleanup()
+
+ # Create model
+ self.vae = self.pipe.vae.eval()
+ self.unet = self.pipe.unet.eval()
+
+ for p in self.vae.parameters():
+ p.requires_grad_(False)
+ for p in self.unet.parameters():
+ p.requires_grad_(False)
+
+ if self.cfg.token_merging:
+ import tomesd
+
+ tomesd.apply_patch(self.unet, **self.cfg.token_merging_params)
+
+ self.scheduler = DDIMScheduler.from_pretrained(
+ self.cfg.pretrained_model_name_or_path,
+ subfolder="scheduler",
+ torch_dtype=self.weights_dtype,
+ )
+ self.scheduler.alphas_cumprod = self.scheduler.alphas_cumprod.to(device=self.device)
+ self.scheduler.set_timesteps(self.cfg.n_ddim_steps, device=self.device)
+
+ self.inverse_scheduler = DDIMInverseScheduler.from_pretrained(
+ self.cfg.pretrained_model_name_or_path,
+ subfolder="scheduler",
+ torch_dtype=self.weights_dtype,
+ )
+ self.inverse_scheduler.set_timesteps(self.cfg.inversion_n_steps, device=self.device)
+ self.inverse_scheduler.alphas_cumprod = self.inverse_scheduler.alphas_cumprod.to(device=self.device)
+
+ self.num_train_timesteps = self.scheduler.config.num_train_timesteps
+ self.set_min_max_steps() # set to default value
+
+ self.alphas: Float[Tensor, "..."] = self.scheduler.alphas_cumprod.to(
+ self.device
+ )
+ self.grad_clip_val: Optional[float] = None
+
+ threestudio.info(f"Loaded Stable Diffusion!")
+
+ @torch.cuda.amp.autocast(enabled=False)
+ def set_min_max_steps(self, min_step_percent=0.02, max_step_percent=0.98):
+ self.min_step = int(self.num_train_timesteps * min_step_percent)
+ self.max_step = int(self.num_train_timesteps * max_step_percent)
+
+ @torch.cuda.amp.autocast(enabled=False)
+ def forward_unet(
+ self,
+ latents: Float[Tensor, "..."],
+ t: Float[Tensor, "..."],
+ encoder_hidden_states: Float[Tensor, "..."],
+ ) -> Float[Tensor, "..."]:
+ input_dtype = latents.dtype
+ return self.unet(
+ latents.to(self.weights_dtype),
+ t.to(self.weights_dtype),
+ encoder_hidden_states=encoder_hidden_states.to(self.weights_dtype),
+ ).sample.to(input_dtype)
+
+ @torch.cuda.amp.autocast(enabled=False)
+ def encode_images(
+ self, imgs: Float[Tensor, "B 3 512 512"]
+ ) -> Float[Tensor, "B 4 64 64"]:
+ input_dtype = imgs.dtype
+ imgs = imgs * 2.0 - 1.0
+ posterior = self.vae.encode(imgs.to(self.weights_dtype)).latent_dist
+ latents = posterior.sample() * self.vae.config.scaling_factor
+ return latents.to(input_dtype)
+
+ @torch.cuda.amp.autocast(enabled=False)
+ def decode_latents(
+ self,
+ latents: Float[Tensor, "B 4 H W"],
+ latent_height: int = 64,
+ latent_width: int = 64,
+ ) -> Float[Tensor, "B 3 512 512"]:
+ input_dtype = latents.dtype
+ latents = F.interpolate(
+ latents, (latent_height, latent_width), mode="bilinear", align_corners=False
+ )
+ latents = 1 / self.vae.config.scaling_factor * latents
+ image = self.vae.decode(latents.to(self.weights_dtype)).sample
+ image = (image * 0.5 + 0.5).clamp(0, 1)
+ return image.to(input_dtype)
+
+ @torch.no_grad()
+ def predict_noise(
+ self,
+ latents_noisy: Float[Tensor, "B 4 64 64"],
+ t: Int[Tensor, "B"],
+ prompt_utils: PromptProcessorOutput,
+ elevation: Float[Tensor, "B"],
+ azimuth: Float[Tensor, "B"],
+ camera_distances: Float[Tensor, "B"],
+ guidance_scale: float = 1.0,
+ ):
+
+ batch_size = elevation.shape[0]
+
+ if prompt_utils.use_perp_neg:
+ (
+ text_embeddings,
+ neg_guidance_weights,
+ ) = prompt_utils.get_text_embeddings_perp_neg(
+ elevation, azimuth, camera_distances, self.cfg.view_dependent_prompting
+ )
+
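+            # Perp-Neg: run the UNet on the positive, unconditional, and negative
+            # view-dependent embeddings in one batch, then keep only the components of the
+            # negative directions that are perpendicular to the positive direction e_pos.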
+ latent_model_input = torch.cat([latents_noisy] * 4, dim=0)
+ noise_pred = self.forward_unet(
+ latent_model_input,
+ torch.cat([t] * 4),
+ encoder_hidden_states=text_embeddings,
+ ) # (4B, 3, 64, 64)
+
+ noise_pred_text = noise_pred[:batch_size]
+ noise_pred_uncond = noise_pred[batch_size : batch_size * 2]
+ noise_pred_neg = noise_pred[batch_size * 2 :]
+
+ e_pos = noise_pred_text - noise_pred_uncond
+ accum_grad = 0
+ n_negative_prompts = neg_guidance_weights.shape[-1]
+ for i in range(n_negative_prompts):
+ e_i_neg = noise_pred_neg[i::n_negative_prompts] - noise_pred_uncond
+ accum_grad += neg_guidance_weights[:, i].view(
+ -1, 1, 1, 1
+ ) * perpendicular_component(e_i_neg, e_pos)
+
+ noise_pred = noise_pred_uncond + guidance_scale * (
+ e_pos + accum_grad
+ )
+ else:
+ neg_guidance_weights = None
+ text_embeddings = prompt_utils.get_text_embeddings(
+ elevation, azimuth, camera_distances, self.cfg.view_dependent_prompting
+ )
+ # predict the noise residual with unet, NO grad!
+ with torch.no_grad():
+ # pred noise
+ latent_model_input = torch.cat([latents_noisy] * 2, dim=0)
+ noise_pred = self.forward_unet(
+ latent_model_input,
+ torch.cat([t] * 2),
+ encoder_hidden_states=text_embeddings,
+ )
+
+ # perform guidance (high scale from paper!)
+ noise_pred_text, noise_pred_uncond = noise_pred.chunk(2)
+ noise_pred = noise_pred_text + guidance_scale * (
+ noise_pred_text - noise_pred_uncond
+ )
+
+ return noise_pred, neg_guidance_weights, text_embeddings
+
+ @torch.no_grad()
+ def invert_noise(self, start_latents, invert_to_t, prompt_utils, elevation, azimuth, camera_distances):
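+        # DDIM inversion: step the rendered latents through all inversion timesteps below
+        # `invert_to_t` using the inverse scheduler and `inversion_guidance_scale` for the
+        # noise prediction, re-noising each step with `inversion_eta`-scaled Gaussian noise.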
+ latents = start_latents.clone()
+ B = start_latents.shape[0]
+
+ timesteps = self.inverse_scheduler.timesteps[self.inverse_scheduler.timesteps < invert_to_t]
+
+ inversion_eta = self.cfg.inversion_eta
+ for t in timesteps:
+ noise_pred, _, _ = self.predict_noise(latents, t.repeat([B]), prompt_utils, elevation, azimuth, camera_distances,
+ guidance_scale=self.cfg.inversion_guidance_scale)
+ latents = self.inverse_scheduler.step(noise_pred, t, latents).prev_sample
+
+ prev_t = t + self.inverse_scheduler.config.num_train_timesteps // self.inverse_scheduler.num_inference_steps
+ # prev_t = prev_t.clamp(0, self.inverse_scheduler.config.num_train_timesteps - 1)
+ variance = self.scheduler._get_variance(prev_t, t) ** (0.5)
+ latents += inversion_eta * torch.randn_like(latents) * variance
+ return latents
+
+ def get_noise_from_target(self, target, cur_xt, t):
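+        # Solve x_t = sqrt(alpha_t) * x_0 + sqrt(1 - alpha_t) * eps for eps, i.e. recover
+        # the noise that maps `target` (treated as x_0) to `cur_xt` (x_t) at timestep t.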
+ alpha_prod_t = self.scheduler.alphas_cumprod[t]
+ beta_prod_t = 1 - alpha_prod_t
+ noise = (cur_xt - target * alpha_prod_t ** (0.5)) / (beta_prod_t ** (0.5))
+ return noise
+
+ def get_x0(self, original_samples, noise_pred, t):
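+        # Take the predicted clean sample x0 from the scheduler's step output; depending
+        # on the scheduler it is exposed as `pred_original_sample` or `denoised`.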
+ step_results = self.scheduler.step(noise_pred, t[0], original_samples, return_dict=True)
+ if "pred_original_sample" in step_results:
+ return step_results["pred_original_sample"]
+ elif "denoised" in step_results:
+ return step_results["denoised"]
+ raise ValueError("Looks like the scheduler does not compute x0")
+
+ @torch.no_grad()
+ def compute_grad_sdi(
+ self,
+ latents: Float[Tensor, "B 4 64 64"],
+ image: Float[Tensor, "B 3 512 512"],
+ t: Int[Tensor, "B"],
+ prompt_utils: PromptProcessorOutput,
+ elevation: Float[Tensor, "B"],
+ azimuth: Float[Tensor, "B"],
+ camera_distances: Float[Tensor, "B"],
+ ):
+
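+        # With SDI enabled, the noisy latents come from DDIM inversion of the rendering
+        # rather than from adding random noise (the SDS-style fallback below); the
+        # text-guided x0 prediction at timestep t is then returned as the regression target.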
+ if self.cfg.enable_sdi:
+ latents_noisy = self.invert_noise(latents, t, prompt_utils, elevation, azimuth, camera_distances)
+ noise = self.get_noise_from_target(latents, latents_noisy, t)
+ else:
+ noise = torch.randn_like(latents)
+ latents_noisy = self.scheduler.add_noise(latents, noise, t)
+
+ noise_pred, neg_guidance_weights, text_embeddings = self.predict_noise(
+ latents_noisy,
+ t,
+ prompt_utils,
+ elevation,
+ azimuth,
+ camera_distances,
+ guidance_scale=self.cfg.guidance_scale
+ )
+
+ latents_denoised = self.get_x0(latents_noisy, noise_pred, t).detach() # (latents_noisy - sigma * noise_pred) / alpha
+
+ guidance_eval_utils = {
+ "use_perp_neg": prompt_utils.use_perp_neg,
+ "neg_guidance_weights": neg_guidance_weights,
+ "text_embeddings": text_embeddings,
+ "t_orig": t,
+ "latents_noisy": latents_noisy,
+ "noise_pred": noise_pred,
+ }
+
+ return latents_denoised, latents_noisy, guidance_eval_utils
+
+ def __call__(
+ self,
+ rgb: Float[Tensor, "B H W C"],
+ prompt_utils: PromptProcessorOutput,
+ elevation: Float[Tensor, "B"],
+ azimuth: Float[Tensor, "B"],
+ camera_distances: Float[Tensor, "B"],
+ rgb_as_latents=False,
+ guidance_eval=False,
+ test_call=False,
+ **kwargs,
+ ):
+ batch_size = rgb.shape[0]
+
+ rgb_BCHW = rgb.permute(0, 3, 1, 2)
+ latents: Float[Tensor, "B 4 64 64"]
+ rgb_BCHW_512 = F.interpolate(
+ rgb_BCHW, (512, 512), mode="bilinear", align_corners=False
+ )
+ if rgb_as_latents:
+ latents = F.interpolate(
+ rgb_BCHW, (64, 64), mode="bilinear", align_corners=False
+ )
+ else:
+ # encode image into latents with vae
+ latents = self.encode_images(rgb_BCHW_512)
+
+ # timestep ~ U(0.02, 0.98) to avoid very high/low noise level
+ t = torch.randint(
+ self.min_step,
+ self.max_step + 1,
+ [batch_size],
+ dtype=torch.long,
+ device=self.device,
+ )
+
+ target, noisy_img, guidance_eval_utils = self.compute_grad_sdi(
+ latents,
+ rgb_BCHW_512,
+ t,
+ prompt_utils,
+ elevation,
+ azimuth,
+ camera_distances,
+ )
+
+ if test_call:
+ return target, noisy_img
+
+ # loss = SpecifyGradient.apply(latents, grad)
+        # SpecifyGradient is not straightforward; use a reparameterization trick instead
+ # target = (latents - grad).detach()
+ # d(loss)/d(latents) = latents - target = latents - (latents - grad) = grad
+ loss_sdi = 0.5 * F.mse_loss(latents, target.detach(), reduction="mean") / batch_size
+
+ guidance_out = {
+ "loss_sdi": loss_sdi,
+ "grad_norm": (latents - target).norm(),
+ "min_step": self.min_step,
+ "max_step": self.max_step,
+ }
+
+ if guidance_eval:
+ guidance_eval_out = self.guidance_eval(**guidance_eval_utils)
+ texts = []
+ for n, e, a, c in zip(
+ guidance_eval_out["noise_levels"], elevation, azimuth, camera_distances
+ ):
+ texts.append(
+ f"n{n:.02f}\ne{e.item():.01f}\na{a.item():.01f}\nc{c.item():.02f}"
+ )
+ guidance_eval_out.update({"texts": texts})
+ guidance_out.update({"eval": guidance_eval_out})
+
+ return guidance_out
+
+ @torch.cuda.amp.autocast(enabled=False)
+ @torch.no_grad()
+ def get_noise_pred(
+ self,
+ latents_noisy,
+ t,
+ text_embeddings,
+ use_perp_neg=False,
+ neg_guidance_weights=None,
+ ):
+ batch_size = latents_noisy.shape[0]
+
+ if use_perp_neg:
+ # pred noise
+ latent_model_input = torch.cat([latents_noisy] * 4, dim=0)
+ noise_pred = self.forward_unet(
+ latent_model_input,
+ torch.cat([t.reshape(1)] * 4).to(self.device),
+ encoder_hidden_states=text_embeddings,
+ ) # (4B, 3, 64, 64)
+
+ noise_pred_text = noise_pred[:batch_size]
+ noise_pred_uncond = noise_pred[batch_size : batch_size * 2]
+ noise_pred_neg = noise_pred[batch_size * 2 :]
+
+ e_pos = noise_pred_text - noise_pred_uncond
+ accum_grad = 0
+ n_negative_prompts = neg_guidance_weights.shape[-1]
+ for i in range(n_negative_prompts):
+ e_i_neg = noise_pred_neg[i::n_negative_prompts] - noise_pred_uncond
+ accum_grad += neg_guidance_weights[:, i].view(
+ -1, 1, 1, 1
+ ) * perpendicular_component(e_i_neg, e_pos)
+
+ noise_pred = noise_pred_uncond + self.cfg.guidance_scale * (
+ e_pos + accum_grad
+ )
+ else:
+ # pred noise
+ latent_model_input = torch.cat([latents_noisy] * 2, dim=0)
+ noise_pred = self.forward_unet(
+ latent_model_input,
+ torch.cat([t.reshape(1)] * 2).to(self.device),
+ encoder_hidden_states=text_embeddings,
+ )
+ # perform guidance (high scale from paper!)
+ noise_pred_text, noise_pred_uncond = noise_pred.chunk(2)
+ noise_pred = noise_pred_text + self.cfg.guidance_scale * (
+ noise_pred_text - noise_pred_uncond
+ )
+
+ return noise_pred
+
+ @torch.cuda.amp.autocast(enabled=False)
+ @torch.no_grad()
+ def guidance_eval(
+ self,
+ t_orig,
+ text_embeddings,
+ latents_noisy,
+ noise_pred,
+ use_perp_neg=False,
+ neg_guidance_weights=None,
+ ):
+ # use only 50 timesteps, and find nearest of those to t
+ self.scheduler.set_timesteps(self.cfg.n_ddim_steps)
+ self.scheduler.timesteps_gpu = self.scheduler.timesteps.to(self.device)
+ bs = (
+ min(self.cfg.max_items_eval, latents_noisy.shape[0])
+ if self.cfg.max_items_eval > 0
+ else latents_noisy.shape[0]
+ ) # batch size
+ large_enough_idxs = self.scheduler.timesteps_gpu.expand([bs, -1]) > t_orig[
+ :bs
+ ].unsqueeze(
+ -1
+ ) # sized [bs,50] > [bs,1]
+ idxs = torch.min(large_enough_idxs, dim=1)[1]
+ t = self.scheduler.timesteps_gpu[idxs]
+
+ fracs = list((t / self.scheduler.config.num_train_timesteps).cpu().numpy())
+ imgs_noisy = self.decode_latents(latents_noisy[:bs]).permute(0, 2, 3, 1)
+
+ # get prev latent
+ latents_1step = []
+ pred_1orig = []
+ for b in range(bs):
+ step_output = self.scheduler.step(
+ noise_pred[b : b + 1], t[b], latents_noisy[b : b + 1], eta=1
+ )
+ latents_1step.append(step_output["prev_sample"])
+ pred_1orig.append(step_output["pred_original_sample"])
+ latents_1step = torch.cat(latents_1step)
+ pred_1orig = torch.cat(pred_1orig)
+ imgs_1step = self.decode_latents(latents_1step).permute(0, 2, 3, 1)
+ imgs_1orig = self.decode_latents(pred_1orig).permute(0, 2, 3, 1)
+
+ latents_final = []
+ for b, i in enumerate(idxs):
+ latents = latents_1step[b : b + 1]
+ text_emb = (
+ text_embeddings[
+ [b, b + len(idxs), b + 2 * len(idxs), b + 3 * len(idxs)], ...
+ ]
+ if use_perp_neg
+ else text_embeddings[[b, b + len(idxs)], ...]
+ )
+ neg_guid = neg_guidance_weights[b : b + 1] if use_perp_neg else None
+ for t in tqdm(self.scheduler.timesteps[i + 1 :], leave=False):
+ # pred noise
+ noise_pred = self.get_noise_pred(
+ latents, t, text_emb, use_perp_neg, neg_guid
+ )
+ # get prev latent
+ latents = self.scheduler.step(noise_pred, t, latents, eta=1)[
+ "prev_sample"
+ ]
+ latents_final.append(latents)
+
+ latents_final = torch.cat(latents_final)
+ imgs_final = self.decode_latents(latents_final).permute(0, 2, 3, 1)
+
+ return {
+ "bs": bs,
+ "noise_levels": fracs,
+ "imgs_noisy": imgs_noisy,
+ "imgs_1step": imgs_1step,
+ "imgs_1orig": imgs_1orig,
+ "imgs_final": imgs_final,
+ }
+
+ def update_step(self, epoch: int, global_step: int, on_load_weights: bool = False):
+ # clip grad for stable training as demonstrated in
+ # Debiasing Scores and Prompts of 2D Diffusion for Robust Text-to-3D Generation
+ # http://arxiv.org/abs/2303.15413
+ if self.cfg.grad_clip is not None:
+ self.grad_clip_val = C(self.cfg.grad_clip, epoch, global_step)
+
+ if self.cfg.t_anneal:
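+            # Linearly anneal the noise level: min and max collapse to a single timestep
+            # percentage that moves from max_step_percent down to min_step_percent
+            # over trainer_max_steps.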
+ percentage = (
+ float(global_step) / self.cfg.trainer_max_steps
+ ) # progress percentage
+ if type(self.cfg.max_step_percent) not in [float, int]:
+ max_step_percent = self.cfg.max_step_percent[1]
+ else:
+ max_step_percent = self.cfg.max_step_percent
+ curr_percent = (
+ max_step_percent - C(self.cfg.min_step_percent, epoch, global_step)
+ ) * (1 - percentage) + C(self.cfg.min_step_percent, epoch, global_step)
+ self.set_min_max_steps(
+ min_step_percent=curr_percent,
+ max_step_percent=curr_percent,
+ )
+ else:
+ self.set_min_max_steps(
+ min_step_percent=C(self.cfg.min_step_percent, epoch, global_step),
+ max_step_percent=C(self.cfg.max_step_percent, epoch, global_step),
+ )
diff --git a/threestudio/systems/__init__.py b/threestudio/systems/__init__.py
index edbe7bf2..cf351dfd 100644
--- a/threestudio/systems/__init__.py
+++ b/threestudio/systems/__init__.py
@@ -12,4 +12,5 @@
textmesh,
zero123,
zero123_simple,
+    sdi,
)
diff --git a/threestudio/systems/sdi.py b/threestudio/systems/sdi.py
new file mode 100644
index 00000000..457d34bc
--- /dev/null
+++ b/threestudio/systems/sdi.py
@@ -0,0 +1,265 @@
+from dataclasses import dataclass, field
+
+import torch
+
+import threestudio
+from threestudio.systems.base import BaseLift3DSystem
+from threestudio.utils.ops import binary_cross_entropy, dot
+from threestudio.utils.typing import *
+import torch.nn.functional as F
+
+
+@threestudio.register("sdi-system")
+class ScoreDistillationViaInversion(BaseLift3DSystem):
+ @dataclass
+ class Config(BaseLift3DSystem.Config):
+ pass
+
+ cfg: Config
+
+ def configure(self):
+ # create geometry, material, background, renderer
+ super().configure()
+
+ def forward(self, batch: Dict[str, Any]) -> Dict[str, Any]:
+ render_out = self.renderer(**batch)
+ return {
+ **render_out,
+ }
+
+ def on_fit_start(self) -> None:
+ super().on_fit_start()
+ # only used in training
+ self.prompt_processor = threestudio.find(self.cfg.prompt_processor_type)(
+ self.cfg.prompt_processor
+ )
+ self.guidance = threestudio.find(self.cfg.guidance_type)(self.cfg.guidance)
+
+ def training_step(self, batch, batch_idx):
+ out = self(batch)
+ prompt_utils = self.prompt_processor()
+ guidance_out = self.guidance(
+ out["comp_rgb"], prompt_utils, **batch, rgb_as_latents=False
+ )
+
+ loss = 0.0
+
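+        # Every guidance output named "loss_*" is weighted by the matching "lambda_*"
+        # schedule from the config (e.g. loss_sdi is scaled by lambda_sdi).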
+ for name, value in guidance_out.items():
+ if not (type(value) is torch.Tensor and value.numel() > 1):
+ self.log(f"train/{name}", value)
+ if name.startswith("loss_"):
+ loss += value * self.C(self.cfg.loss[name.replace("loss_", "lambda_")])
+
+ if self.C(self.cfg.loss.lambda_orient) > 0:
+ if "normal" not in out:
+ raise ValueError(
+ "Normal is required for orientation loss, no normal is found in the output."
+ )
+ loss_orient = (
+ out["weights"].detach()
+ * dot(out["normal"], out["t_dirs"]).clamp_min(0.0) ** 2
+ ).sum() / (out["opacity"] > 0).sum()
+ self.log("train/loss_orient", loss_orient)
+ loss += loss_orient * self.C(self.cfg.loss.lambda_orient)
+
+ loss_sparsity_initial = (out["opacity"] ** 2 + 0.01)
+ loss_sparsity_sqrt = loss_sparsity_initial.sqrt()
+ loss_sparsity = F.relu(loss_sparsity_sqrt.mean())
+ self.log("train/loss_sparsity", loss_sparsity)
+ loss += loss_sparsity * self.C(self.cfg.loss.lambda_sparsity)
+
+ opacity_clamped = out["opacity"].clamp(1.0e-3, 1.0 - 1.0e-3)
+ loss_opaque = binary_cross_entropy(opacity_clamped, opacity_clamped)
+ self.log("train/loss_opaque", loss_opaque)
+ loss += loss_opaque * self.C(self.cfg.loss.lambda_opaque)
+
+ # z-variance loss proposed in HiFA: https://hifa-team.github.io/HiFA-site/
+ if "z_variance" in out and "lambda_z_variance" in self.cfg.loss:
+ loss_z_variance = out["z_variance"][out["opacity"] > 0.5].mean()
+ self.log("train/loss_z_variance", loss_z_variance)
+ loss += loss_z_variance * self.C(self.cfg.loss.lambda_z_variance)
+
+ for name, value in self.cfg.loss.items():
+ self.log(f"train_params/{name}", self.C(value))
+
+ return {"loss": loss}
+
+ def validation_step(self, batch, batch_idx):
+ out = self(batch)
+
+ with torch.no_grad():
+ pred_x0_latent, noisy_latent = self.guidance(
+ out["comp_rgb"], self.prompt_processor(), **batch, rgb_as_latents=False, test_call=True
+ )
+
+ # self.save_image_grid(
+ # f"it{self.true_global_step}-{batch['index'][0]}.png",
+ # [
+ # {
+ # "type": "rgb",
+ # "img": out["comp_rgb"][0],
+ # "kwargs": {"data_format": "HWC"},
+ # },
+ # ]
+ # + (
+ # [
+ # {
+ # "type": "rgb",
+ # "img": out["comp_normal"][0],
+ # "kwargs": {"data_format": "HWC", "data_range": (0, 1)},
+ # }
+ # ]
+ # if "comp_normal" in out
+ # else []
+ # )
+ # + [
+ # {
+ # "type": "grayscale",
+ # "img": out["opacity"][0, :, :, 0],
+ # "kwargs": {"cmap": None, "data_range": (0, 1)},
+ # },
+ # ],
+ # name="validation_step",
+ # step=self.true_global_step,
+ # )
+
+ self.save_image_grid(
+ f"it{self.true_global_step}-{batch['index'][0]}.png",
+ [
+ {
+ "type": "rgb",
+ "img": out["comp_rgb"][0],
+ "kwargs": {"data_format": "HWC"},
+ },
+ ]
+ + (
+ [
+ {
+ "type": "rgb",
+ "img": out["comp_normal"][0],
+ "kwargs": {"data_format": "HWC", "data_range": (0, 1)},
+ }
+ ]
+ if "comp_normal" in out
+ else []
+ )
+ + (
+ [
+ {
+ "type": "grayscale",
+ "img": out["depth_d"][0, :, :, 0],
+ "kwargs": {"cmap": None, "data_range": (0, 1)},
+ }
+ ]
+ if "depth_d" in out
+ else []
+ )
+ + (
+ [
+ {
+ "type": "rgb",
+ "img": add_channel_to_image( ( get_img_eigenvalues(out["depth_d"]) * out["opacity"] ) )[0],
+ "kwargs": {"data_format": "HWC", "data_range": (-.1, .1)},
+ }
+ ]
+ if "depth_d" in out
+ else []
+ )
+ + [
+ {
+ "type": "grayscale",
+ "img": out["opacity"][0, :, :, 0],
+ "kwargs": {"cmap": None, "data_range": (0, 1)},
+ },
+ ]
+ # + [
+ # {
+ # "type": "grayscale",
+ # "img": rescaled_nerf_depth[0, :, :, 0],
+ # "kwargs": {"cmap": None, "data_range": (0, 1)},
+ # },
+ # ]
+ + [
+ {
+ "type": "rgb",
+ "img": self.guidance.decode_latents(noisy_latent)[0].permute(1, 2, 0),
+ "kwargs": {"data_format": "HWC"},
+ },
+ ]
+ + [
+ {
+ "type": "rgb",
+ "img": self.guidance.decode_latents(pred_x0_latent)[0].permute(1, 2, 0),
+ "kwargs": {"data_format": "HWC"},
+ },
+ ]
+ + [
+ {
+ "type": "rgb",
+ "img": self.guidance.decode_latents(pred_x0_latent)[0].permute(1, 2, 0) - out["comp_rgb"][0],
+ "kwargs": {"data_format": "HWC"},
+ },
+ ]
+ + (
+ [
+ {
+ "type": "grayscale",
+ "img": predicted_depth[0, :, :, 0],
+ "kwargs": {"cmap": None, "data_range": (0, 1)},
+ }
+ ]
+ if "lambda_depth" in self.cfg.loss
+ else []
+ )
+ ,
+ name="validation_step",
+ step=self.true_global_step,
+ )
+
+
+ def on_validation_epoch_end(self):
+ pass
+
+ def test_step(self, batch, batch_idx):
+ out = self(batch)
+ self.save_image_grid(
+ f"it{self.true_global_step}-test/{batch['index'][0]}.png",
+ [
+ {
+ "type": "rgb",
+ "img": out["comp_rgb"][0],
+ "kwargs": {"data_format": "HWC"},
+ },
+ ]
+ + (
+ [
+ {
+ "type": "rgb",
+ "img": out["comp_normal"][0],
+ "kwargs": {"data_format": "HWC", "data_range": (0, 1)},
+ }
+ ]
+ if "comp_normal" in out
+ else []
+ )
+ + [
+ {
+ "type": "grayscale",
+ "img": out["opacity"][0, :, :, 0],
+ "kwargs": {"cmap": None, "data_range": (0, 1)},
+ },
+ ],
+ name="test_step",
+ step=self.true_global_step,
+ )
+
+ def on_test_epoch_end(self):
+ self.save_img_sequence(
+ f"it{self.true_global_step}-test",
+ f"it{self.true_global_step}-test",
+ "(\d+)\.png",
+ save_format="mp4",
+ fps=30,
+ name="test",
+ step=self.true_global_step,
+ )
From 5f10ad1d96f301a04348c4acadfd9672015ea490 Mon Sep 17 00:00:00 2001
From: Artem Lukoianov
Date: Sun, 1 Sep 2024 09:52:52 -0400
Subject: [PATCH 02/14] UPD: convexity loss
---
README.md | 3 +++
configs/sdi.yaml | 18 ++++++--------
requirements.txt | 2 +-
.../diffuse_with_point_light_material.py | 11 +++++----
threestudio/systems/sdi.py | 24 +++++++++++++++++++
5 files changed, 41 insertions(+), 17 deletions(-)
diff --git a/README.md b/README.md
index 2008f253..3174aa13 100644
--- a/README.md
+++ b/README.md
@@ -72,6 +72,9 @@ pip install ninja
pip install -r requirements.txt
```
+⚠️ Newer versions of diffusers can break generation; please make sure you are using `diffusers==0.19.3`. ⚠️
+
+
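+For example, you can pin the version explicitly:
+
+```sh
+pip install diffusers==0.19.3
+```
+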
For additional options, please refer to the official threestudio installation instructions [here](https://github.com/threestudio-project/threestudio?tab=readme-ov-file#installation).
## Running generation
diff --git a/configs/sdi.yaml b/configs/sdi.yaml
index ddd98683..3f685a0e 100644
--- a/configs/sdi.yaml
+++ b/configs/sdi.yaml
@@ -43,10 +43,10 @@ system:
material_type: "diffuse-with-point-light-material"
material:
- ambient_only_steps: 10001
+ ambient_only_steps: 1000
albedo_activation: sigmoid
- diffuse_prob: 0.4 #0.75
- textureless_prob: 0.5
+ diffuse_prob: 0.3
+ textureless_prob: 0.75
background_type: "neural-environment-map-background"
background:
@@ -56,6 +56,7 @@ system:
renderer:
radius: ${system.geometry.radius}
num_samples_per_ray: 512
+ return_comp_normal: true
prompt_processor_type: "stable-diffusion-prompt-processor"
prompt_processor:
@@ -68,7 +69,7 @@ system:
pretrained_model_name_or_path: "stabilityai/stable-diffusion-2-1-base"
guidance_scale: 7.5
weighting_strategy: sds
- min_step_percent: 0.2
+ min_step_percent: 0.32
max_step_percent: 0.98
# SDI parameters
@@ -85,17 +86,12 @@ system:
name: None
loss:
- # lambda_sds: 1.
- # lambda_orient: [0, 10., 1000., 5000]
- # lambda_sparsity: 1.
- # lambda_opaque: 0.
lambda_sdi: 1.
lambda_orient: 0.1
lambda_sparsity: [0,0.15,0.,3000]
lambda_opaque: 0.1
- # lambda_z_variance: 1.
- # lambda_convex: [0,1.,0.1,4000]
- # lambda_convex_hess: 0.
+ lambda_convex: [0,1.,0.1,4000]
+ lambda_z_variance: 1.
optimizer:
name: Adam
diff --git a/requirements.txt b/requirements.txt
index 88706a6a..6d4d33a1 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -4,7 +4,7 @@ jaxtyping
typeguard
git+https://github.com/KAIR-BAIR/nerfacc.git@v0.5.2
git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
-diffusers<0.20
+diffusers==0.19.3 #<0.20
transformers==4.28.1
accelerate
opencv-python
diff --git a/threestudio/models/materials/diffuse_with_point_light_material.py b/threestudio/models/materials/diffuse_with_point_light_material.py
index abf06717..db9b08c0 100644
--- a/threestudio/models/materials/diffuse_with_point_light_material.py
+++ b/threestudio/models/materials/diffuse_with_point_light_material.py
@@ -91,11 +91,12 @@ def forward(
else:
shading = "diffuse"
else:
- if self.ambient_only:
- shading = "albedo"
- else:
- # return shaded color by default in evaluation
- shading = "diffuse"
+ shading = "albedo"
+ # if self.ambient_only:
+ # shading = "albedo"
+ # else:
+ # # return shaded color by default in evaluation
+ # shading = "diffuse"
# multiply by 0 to prevent checking for unused parameters in DDP
if shading == "albedo":
diff --git a/threestudio/systems/sdi.py b/threestudio/systems/sdi.py
index 457d34bc..7fec1cdf 100644
--- a/threestudio/systems/sdi.py
+++ b/threestudio/systems/sdi.py
@@ -13,6 +13,7 @@
class ScoreDistillationViaInversion(BaseLift3DSystem):
@dataclass
class Config(BaseLift3DSystem.Config):
+ convexity_res: int = 8
pass
cfg: Config
@@ -78,6 +79,29 @@ def training_step(self, batch, batch_idx):
loss_z_variance = out["z_variance"][out["opacity"] > 0.5].mean()
self.log("train/loss_z_variance", loss_z_variance)
loss += loss_z_variance * self.C(self.cfg.loss.lambda_z_variance)
+
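+        # The z component of the cross product of neighboring (downscaled) normals
+        # approximates the sine of the angle between them; the negative sign on
+        # loss_convexity rewards a consistent bending direction, i.e. a more convex surface.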
+        # Naive convexity loss
+        if ("lambda_convex" in self.cfg.loss) and (self.C(self.cfg.loss.lambda_convex) > 1e-6):
+            downscaled_norms = F.interpolate(out["comp_normal"].permute(0, 3, 1, 2), [self.cfg.convexity_res, self.cfg.convexity_res], mode='bilinear', align_corners=False).permute(0, 2, 3, 1)
+
+            # Left-right
+            right_normals = downscaled_norms[:, :, 1:, :]  # Drop the first column
+            left_normals = downscaled_norms[:, :, :-1, :]  # Drop the last column to align with right_normals
+
+            h_cross_product = torch.cross(left_normals, right_normals, dim=-1)
+            h_sine_of_angle = h_cross_product[..., 2]
+
+            # Up-down
+            up_normals = downscaled_norms[:, :-1, :, :]
+            down_normals = downscaled_norms[:, 1:, :, :]
+
+            v_cross_product = torch.cross(down_normals, up_normals, dim=-1)
+            v_sine_of_angle = v_cross_product[..., 2]
+
+            loss_convexity = -(h_sine_of_angle.mean() + v_sine_of_angle.mean())
+            self.log("train/loss_convexity", loss_convexity)
+            loss += loss_convexity * self.C(self.cfg.loss.lambda_convex)
+
for name, value in self.cfg.loss.items():
self.log(f"train_params/{name}", self.C(value))
From 20a4df86be92847ded2174f803b8c28f5da2ca75 Mon Sep 17 00:00:00 2001
From: Artem Lukoianov
Date: Sun, 1 Sep 2024 12:15:15 -0400
Subject: [PATCH 03/14] NEW: add more prompt examples
---
README.md | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/README.md b/README.md
index 3174aa13..c3363eca 100644
--- a/README.md
+++ b/README.md
@@ -79,10 +79,17 @@ For additional options please address the official installation instructions of
## Running generation
The process of generating a shape is similar to the one described in the [threestudio](https://github.com/threestudio-project/threestudio?tab=readme-ov-file#quickstart) documentation.
-Make sure you are using the SDI config file, like below:
+Make sure you are using the SDI config file, as shown below.
+Here are a few examples with different prompts:
```sh
-python launch.py --config configs/sdi.yaml --train --gpu 0 system.prompt_processor.prompt="a zoomed out DSLR photo of a hamburger"
+python launch.py --config configs/sdi.yaml --train --gpu 0 system.prompt_processor.prompt="pumpkin head zombie, skinny, highly detailed, photorealistic"
+
+python launch.py --config configs/sdi.yaml --train --gpu 1 system.prompt_processor.prompt="a photograph of a ninja"
+
+python launch.py --config configs/sdi.yaml --train --gpu 2 system.prompt_processor.prompt="a zoomed out DSLR photo of a hamburger"
+
+python launch.py --config configs/sdi.yaml --train --gpu 3 system.prompt_processor.prompt="bagel filled with cream cheese and lox"
```
The results will be saved to `outputs/score-distillation-via-inversion/`.
From 65b88c641b77ac35bcdefad0b7ceac72ab8f65eb Mon Sep 17 00:00:00 2001
From: Artem Lukoianov
Date: Sun, 1 Sep 2024 12:16:21 -0400
Subject: [PATCH 04/14] UPD: readme
---
README.md | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/README.md b/README.md
index c3363eca..b8ba48e1 100644
--- a/README.md
+++ b/README.md
@@ -72,7 +72,7 @@ pip install ninja
pip install -r requirements.txt
```
-⚠️ Newer versions of diffusers can break generation; please make sure you are using `diffusers==0.19.3`. ⚠️
+⚠️ Newer versions of diffusers can break generation; please make sure you are using `diffusers==0.19.3`.
For additional options, please refer to the official threestudio installation instructions [here](https://github.com/threestudio-project/threestudio?tab=readme-ov-file#installation).
From 5032fb39cf0bb057c5dbd41705f123278e160520 Mon Sep 17 00:00:00 2001
From: Artem Lukoianov
Date: Wed, 9 Oct 2024 15:42:14 -0400
Subject: [PATCH 05/14] UPD: readme
---
README.md | 7 ++++++-
..._skinny_highly_detailed_photorealistic.gif | Bin 0 -> 2206851 bytes
2 files changed, 6 insertions(+), 1 deletion(-)
create mode 100644 docs/Pumpkin_head_zombie_skinny_highly_detailed_photorealistic.gif
diff --git a/README.md b/README.md
index b8ba48e1..2bd0ed57 100644
--- a/README.md
+++ b/README.md
@@ -2,6 +2,11 @@
This is the official implementation of the paper
+
+
+ ⚠️ The code is in beta; please report any issues to the authors ⚠️
+
+
Score Distillation via Reparametrized DDIM
@@ -18,7 +23,7 @@ Score Distillation via Reparametrized DDIM
-
+
diff --git a/docs/Pumpkin_head_zombie_skinny_highly_detailed_photorealistic.gif b/docs/Pumpkin_head_zombie_skinny_highly_detailed_photorealistic.gif
new file mode 100644
index 0000000000000000000000000000000000000000..81f5d40f704ea0e0f11f1606ff435783145f403f
GIT binary patch
literal 2206851
[base85-encoded binary GIF data omitted]