Release v0.10 · Sxela/WarpFusion

Changelog:

add predicted noise mode (reconstruction / rec) from this comment AUTOMATIC1111/stable-diffusion-webui#736
add prompt schedule for rec
add cfg scale for rec
add captions support to rec prompt
add source selector for rec noise
add v1/v2 support for rec noise
add single controlnet support for rec noise
add multi controlnet to rec noise
add rec steps % option
add rec noise to gui
add TemporalNet from https://huggingface.co/CiaraRowles/TemporalNet
add temporalnet source selector (init/stylized)
skip temporalnet for 1st frame
add masked guidance toggle to gui
add masked diffusion toggle to gui
add softclamp to gui
add temporalnet settings to gui
add controlnet annotator settings to gui
hide sat_scale (causes black screen)
hide inpainting model-specific settings
hide instructpix2pix-specific settings

TemporalNet

TemporalNet is a ControlNet intended to stabilize img2img sequences by using the previous frame as input. You can use it together with other ControlNets in MultiControlNet mode.

GUI -> Controlnet - temporalnet_source
It uses the previous frame for better temporal stability. You can use either the raw init video frame or the stylized frame as input.

GUI -> Controlnet - temporalnet_skip_1st_frame
By default it is skipped for the 1st frame, because it relies on the previous frame. You can force it to be enabled during the 1st frame render by disabling this checkbox. It will then use the 1st raw video frame as its image input.
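The two settings above can be sketched as a small selection function. This is only an illustration of the described behaviour, not the notebook's code: the function name, the frame lists, and the use of strings as stand-ins for images are assumptions, while the "init"/"stylized" values follow the changelog entry.

```python
from typing import Optional, Sequence


def pick_temporalnet_input(
    frame_idx: int,
    temporalnet_source: str,
    temporalnet_skip_1st_frame: bool,
    raw_frames: Sequence[str],
    stylized_frames: Sequence[str],
) -> Optional[str]:
    """Return the image TemporalNet would condition on, or None if it is skipped.

    Hypothetical helper: frames are represented by strings for illustration.
    """
    if frame_idx == 0:
        if temporalnet_skip_1st_frame:
            return None  # no previous frame exists yet, so TemporalNet is skipped
        return raw_frames[0]  # forced on: fall back to the 1st raw video frame
    if temporalnet_source == "init":
        return raw_frames[frame_idx - 1]  # previous raw video frame
    if temporalnet_source == "stylized":
        return stylized_frames[frame_idx - 1]  # previous stylized output
    raise ValueError(f"unknown temporalnet_source: {temporalnet_source!r}")
```

For frame 2 with temporalnet_source set to "stylized", this returns the stylized result of frame 1.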

Reconstructed noise

(similar to img2img alternative script from AUTO1111)

GUI -> diffusion -> reconstructed noise
Diffusion models generate (denoise) images from random noise. During the img2img process, we inject our init image somewhere halfway through the denoising, so we need to add some random noise on top of it for the model to reconstruct it correctly.

This introduces temporal inconsistency due to the randomness of the noise we add.
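To make the source of the flicker concrete, here is a minimal sketch of the usual noising step, assuming the standard diffusion forward process with a cumulative signal level alpha_bar. The function name and the use of a flat list of floats as the "image" are illustrative assumptions, not the notebook's internals.

```python
import math
import random


def add_noise(init, alpha_bar, seed):
    """Noise an init image (a flat list of floats) to the level alpha_bar.

    Standard forward process: sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps,
    where eps is fresh Gaussian noise drawn from the given seed.
    """
    rng = random.Random(seed)
    signal = math.sqrt(alpha_bar)
    sigma = math.sqrt(1.0 - alpha_bar)
    return [signal * v + sigma * rng.gauss(0.0, 1.0) for v in init]


init = [0.2, -0.5, 0.9]
frame_a = add_noise(init, 0.5, seed=1)
frame_b = add_noise(init, 0.5, seed=2)
# Same init image, different random noise: the model starts denoising from
# different points, which is where frame-to-frame inconsistency comes from.
print(frame_a != frame_b)
```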

One of the most elegant solutions to this problem is using reconstructed noise instead of random noise. We take our init image, reverse sample it with our model, and get a very specific noise pattern. Given our model and settings, this noise can be used to exactly reconstruct the init image. This means that if we use this noise during our img2img process, the temporal inconsistency should be much lower than with random noise, as the noise is strongly related to our init image.
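The idea can be sketched with a deterministic DDIM-style update run in both directions. This is a toy under stated assumptions: toy_eps is a hypothetical linear stand-in for the model's noise prediction, ALPHA_BARS is a made-up schedule, and the "image" is a short list of floats. With a real network, and in this toy as well, the round trip is close rather than bit-exact.

```python
import math


def toy_eps(x):
    # Hypothetical stand-in for the diffusion model's noise prediction.
    return [0.1 * v for v in x]


def ddim_step(x, a_from, a_to):
    """Deterministic DDIM update moving x from signal level a_from to a_to."""
    out = []
    for v, e in zip(x, toy_eps(x)):
        x0_pred = (v - math.sqrt(1.0 - a_from) * e) / math.sqrt(a_from)
        out.append(math.sqrt(a_to) * x0_pred + math.sqrt(1.0 - a_to) * e)
    return out


# Assumed schedule, ordered from clean (1.0) to the img2img injection level.
ALPHA_BARS = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5]


def reconstruct_noise(init):
    """Reverse-sample the init image: walk the schedule from clean to noisy."""
    x = list(init)
    for a_from, a_to in zip(ALPHA_BARS, ALPHA_BARS[1:]):
        x = ddim_step(x, a_from, a_to)
    return x


def denoise(noisy):
    """Ordinary sampling: walk the same schedule from noisy back to clean."""
    x = list(noisy)
    levels = ALPHA_BARS[::-1]
    for a_from, a_to in zip(levels, levels[1:]):
        x = ddim_step(x, a_from, a_to)
    return x


init = [0.2, -0.5, 0.9]
rec_noise = reconstruct_noise(init)
restored = denoise(rec_noise)
max_err = max(abs(a - b) for a, b in zip(init, restored))
print(max_err < 0.01)  # the init image is recovered to within a small error
```

Because the reconstructed noise is a deterministic function of the init image, similar consecutive frames yield similar noise, unlike independently drawn random noise.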

use_predicted_noise: enables the feature. It won't work with fixed_code enabled.

Rec Prompt:

The reconstruction prompt. It should describe the scene without the style you're applying in the main prompt. It can use the {caption} placeholder.

For example, if my main prompt is "a beautiful village by salvador dali", the reconstruction prompt should be "a beautiful village".
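A minimal sketch of how a {caption} placeholder could be filled in, assuming plain string substitution. The helper name is hypothetical and the notebook's actual captioning and substitution logic is not shown here.

```python
def fill_rec_prompt(rec_prompt: str, caption: str) -> str:
    """Replace the {caption} placeholder with a caption of the current frame.

    Hypothetical helper: assumes a simple textual substitution.
    """
    return rec_prompt.replace("{caption}", caption)


main_prompt = "a beautiful village by salvador dali"
rec_prompt = fill_rec_prompt("{caption}", "a beautiful village")
print(rec_prompt)  # the scene description without the style part
```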

rec_cfg - reconstruction cfg_scale. Keep it low, between 1 and 1.9.
rec_randomness - add random noise to reconstructed noise. 0 - no random, 1 - full random, no rec noise
rec_source - image used to reconstruct the noise from.
rec_steps_pct - % of the current frame's total steps used for reconstruction. 1 - 100%, most accurate/slowest. 0.5 is okay for starters.
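As a rough illustration of rec_steps_pct and rec_randomness, the sketch below assumes the step count is a rounded fraction of the frame's total steps and that the noise is mixed by linear interpolation. Both are assumptions made for illustration. The notes above only specify the endpoints (0 is pure reconstructed noise, 1 is pure random noise), and the helper names are hypothetical.

```python
import random


def reconstruction_steps(total_steps: int, rec_steps_pct: float) -> int:
    """Number of reverse-sampling steps for a frame (assumed rounding rule)."""
    if not 0.0 < rec_steps_pct <= 1.0:
        raise ValueError("rec_steps_pct is expected to be in (0, 1]")
    return max(1, round(total_steps * rec_steps_pct))


def mix_noise(rec_noise, rec_randomness, seed=0):
    """Blend reconstructed noise with fresh Gaussian noise.

    Assumed linear interpolation: 0 keeps only rec_noise, 1 keeps only random noise.
    """
    rng = random.Random(seed)
    return [
        (1.0 - rec_randomness) * r + rec_randomness * rng.gauss(0.0, 1.0)
        for r in rec_noise
    ]


print(reconstruction_steps(50, 0.5))  # half of the frame's 50 steps
print(mix_noise([0.3, -1.2], 0.0))    # rec_randomness 0 leaves the noise unchanged
```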

GUI updates

Model-specific settings are now only shown when using the respective model version. For example, image_scale and inpainting options are not visible unless you use an instructpix2pix or inpainting model, respectively.

Added numerous frequently used settings to GUI. Some of the rarely used settings (like masked guidance and masked callback inner tweaks) are still left out of the GUI to keep it relatively clean.

A reminder:

Changes in GUI will not be saved into the notebook, but if you run it with new settings, they will be saved to a settings.txt file as usual.
You can load settings in the misc tab.
You do not need to rerun the GUI cell after changing its settings.
