v0.10
Changelog:
- add predicted noise mode (reconstruction / rec) from this comment AUTOMATIC1111/stable-diffusion-webui#736
- add prompt schedule for rec
- add cfg scale for rec
- add captions support to rec prompt
- add source selector for rec noise
- add v1/v2 support for rec noise
- add single controlnet support for rec noise
- add multi controlnet to rec noise
- add rec steps % option
- add rec noise to gui
- add TemporalNet from https://huggingface.co/CiaraRowles/TemporalNet
- add temporalnet source selector (init/stylized)
- skip temporalnet for 1st frame
- add masked guidance toggle to gui
- add masked diffusion toggle to gui
- add softclamp to gui
- add temporalnet settings to gui
- add controlnet annotator settings to gui
- hide sat_scale (causes black screen)
- hide inpainting model-specific settings
- hide instructpix2pix-specific settings
TemporalNet
TemporalNet is a ControlNet model intended to stabilize img2img sequences by using the previous frame as input. You can use it together with other ControlNets in MultiControlNet mode.
GUI -> Controlnet - temporalnet_source
It uses the previous frame for better temporal stability. You can use either the raw init video frame or the stylized frame as input.
GUI -> Controlnet - temporalnet_skip_1st_frame
By default, TemporalNet is skipped for the 1st frame, because it relies on the previous frame. You can force it on for the 1st frame render by disabling this checkbox. It will then use the 1st raw video frame as its image input.
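The two settings above can be sketched as a small selection function. This is only an illustration: the function name `pick_temporalnet_input`, its arguments, and the frame lists are hypothetical and not taken from the notebook; only the "init"/"stylized" choice and the 1st-frame behaviour come from the description above.

```python
from typing import Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def pick_temporalnet_input(
    frame_idx: int,
    init_frames: Sequence[Frame],
    stylized_frames: Sequence[Frame],
    source: str = "stylized",
    skip_1st_frame: bool = True,
) -> Optional[Frame]:
    """Return the image TemporalNet would receive, or None if it is skipped.

    Hypothetical helper: names and signature are assumptions for illustration.
    """
    if frame_idx == 0:
        # There is no previous frame yet.
        if skip_1st_frame:
            return None  # TemporalNet is disabled for this frame
        return init_frames[0]  # forced on: fall back to the 1st raw video frame
    if source == "init":
        return init_frames[frame_idx - 1]  # previous raw video frame
    if source == "stylized":
        return stylized_frames[frame_idx - 1]  # previous stylized output
    raise ValueError(f"unknown temporalnet_source: {source!r}")
```

With `skip_1st_frame=True` the first call returns `None`, so the caller would simply leave TemporalNet out of the ControlNet stack for that frame.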
Reconstructed noise
(similar to img2img alternative script from AUTO1111)
GUI -> diffusion -> reconstructed noise
Diffusion models generate (denoise) images from random noise. During the img2img process, we inject our init image somewhere halfway through the denoising, so we need to add some random noise on top of it for the model to reconstruct it correctly.
This introduces temporal inconsistency due to the randomness of the noise we add.
One of the most elegant solutions to this problem is using reconstructed noise instead of random noise. We take our init image, reverse sample it with our model, and get a very specific noise pattern. Given our model and settings, this noise can be used to reconstruct the init image almost exactly. This means that if we use this noise during our img2img process, the temporal inconsistency should be much lower than with random noise, as the noise is strongly related to our init image.
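The idea can be illustrated with a toy NumPy sketch. Everything here is an assumption made for illustration: `toy_eps_model` is a made-up stand-in for the real model's noise prediction, the update is a generic deterministic DDIM-style step, and the schedule in `alphas` is arbitrary. It is not the notebook's implementation; it only shows that running the sampler in reverse gives a starting point that denoises back to (almost) the init image, while the usual random noising does not.

```python
import numpy as np


def toy_eps_model(x):
    """Toy stand-in for a denoiser's noise prediction (NOT a real diffusion model)."""
    return 0.1 * np.tanh(x)


def ddim_step(x, a_from, a_to):
    """One deterministic DDIM-style move from noise level a_from to a_to."""
    eps = toy_eps_model(x)
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def reconstruct_noise(image, alphas):
    """Reverse sampling: walk from the clean image towards the noisy end."""
    x = image
    for k in range(len(alphas) - 1):
        x = ddim_step(x, alphas[k], alphas[k + 1])
    return x


def denoise(noisy, alphas):
    """Regular sampling: walk from the noisy end back to a clean image."""
    x = noisy
    for k in reversed(range(len(alphas) - 1)):
        x = ddim_step(x, alphas[k + 1], alphas[k])
    return x


rng = np.random.default_rng(0)
alphas = np.linspace(1.0, 0.05, 51)  # assumed schedule: 1.0 = clean, 0.05 = noisy
init_image = rng.uniform(-1.0, 1.0, size=(8, 8))

# Reconstructed starting point, tied to the init image.
rec_start = reconstruct_noise(init_image, alphas)

# Usual img2img starting point: init image with random noise added on top.
a_T = alphas[-1]
random_start = np.sqrt(a_T) * init_image + np.sqrt(1.0 - a_T) * rng.standard_normal(init_image.shape)

rec_error = np.abs(denoise(rec_start, alphas) - init_image).mean()
random_error = np.abs(denoise(random_start, alphas) - init_image).mean()
print(f"mean abs error, reconstructed start: {rec_error:.4f}")
print(f"mean abs error, random start:        {random_error:.4f}")
```

The reverse pass is only approximately the inverse of the sampling pass, because each step evaluates the model at a slightly different point, so the reconstruction error is small but not zero.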
use_predicted_noise
- enable the feature. Won't work when fixed_code is enabled.
Rec Prompt:
- reconstruction prompt. It should describe the scene without the style you're applying in the main prompt. Can use {caption}.
For example, if my main prompt is "a beautiful village by salvador dali", the reconstruction prompt should be "a beautiful village".
rec_cfg
- reconstruction cfg_scale. Keep it low, between 1 and 1.9.
rec_randomness
- add random noise to the reconstructed noise. 0 - no random noise, 1 - fully random noise (no reconstructed noise).
rec_source
- the image to reconstruct the noise from.
rec_steps_pct
- fraction of the current frame's total steps used for reconstruction. 1 means 100% (most accurate, slowest). 0.5 is a reasonable starting point.
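A rough sketch of how some of the settings above might be applied. The helper names (`fill_caption`, `rec_step_count`, `blend_noise`) are hypothetical, and the exact formulas are assumptions: the notebook may round the step count differently and may mix the noise in some other way (a plain linear blend, as used here, lowers the noise variance for values between 0 and 1).

```python
import numpy as np


def fill_caption(rec_prompt: str, caption: str) -> str:
    """Substitute the {caption} placeholder in the reconstruction prompt."""
    return rec_prompt.replace("{caption}", caption)


def rec_step_count(total_steps: int, rec_steps_pct: float) -> int:
    """Number of reconstruction steps for a frame (assumed: truncate, at least 1)."""
    if not 0.0 < rec_steps_pct <= 1.0:
        raise ValueError("rec_steps_pct is expected to be in (0, 1]")
    return max(1, int(total_steps * rec_steps_pct))


def blend_noise(rec_noise: np.ndarray, random_noise: np.ndarray, rec_randomness: float) -> np.ndarray:
    """Assumed linear mix: 0 keeps only rec noise, 1 keeps only random noise."""
    r = float(np.clip(rec_randomness, 0.0, 1.0))
    return (1.0 - r) * rec_noise + r * random_noise


# Example usage with made-up values.
rng = np.random.default_rng(0)
rec = rng.standard_normal((4, 4))
rnd = rng.standard_normal((4, 4))

print(fill_caption("{caption}", "a beautiful village"))
print(rec_step_count(total_steps=50, rec_steps_pct=0.5))
print(blend_noise(rec, rnd, rec_randomness=0.2).shape)
```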
GUI updates
Model-specific settings are now only shown when using the respective model version. For example, image_scale and inpainting options are not visible unless you use an instructpix2pix or inpainting model, respectively.
Added numerous frequently used settings to the GUI. Some of the rarely used settings (like masked guidance and masked callback inner tweaks) are still left out of the GUI to keep it relatively clean.
A reminder:
Changes made in the GUI will not be saved into the notebook, but if you run it with new settings, they will be saved to a settings.txt file as usual.
You can load settings in the misc tab.
You do not need to rerun the GUI cell after changing its settings.