Potentially incorrect equation from paper #3

Open
david-stojanovski opened this issue Jan 11, 2023 · 5 comments

Comments

@david-stojanovski

Equation 16 from the paper, which gives the disentangled component, seems to differ from what is actually in the code.

In the paper the equation is given as:

output(image | labelmap) + s * (output(image | labelmap) - output(image | null_label))

However, looking at the code in /guided_diffusion/gaussian_diffusion.py, the p_mean_variance function contains the following:

model_output_zero = model(x, self._scale_timesteps(t), y=th.zeros_like(model_kwargs['y']))
model_output[:, :3] = model_output_zero[:, :3] + model_kwargs['s'] * (model_output[:, :3] - model_output_zero[:, :3])

This seems to be calculating the following instead:

output(image | null_label) + s * (output(image | labelmap) - output(image | null_label)).
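For concreteness, here is a small numeric sketch of the two formulations (the tensors and values below are placeholders for illustration, not taken from the repo):

import torch as th

# Placeholder tensors standing in for the model's predictions; the real shapes and
# values come from the diffusion model at sampling time.
out_label = th.tensor([1.0, 2.0, 3.0])  # output(image | labelmap)
out_null = th.tensor([0.5, 0.5, 0.5])   # output(image | null_label)
s = 2.0                                  # guidance scale

paper_eq16 = out_label + s * (out_label - out_null)  # equation as written in the paper
code_impl = out_null + s * (out_label - out_null)    # what p_mean_variance computes

print(paper_eq16)  # tensor([2., 5., 8.])
print(code_impl)   # tensor([1.5000, 3.5000, 5.5000])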

Am I understanding this correctly or is this a bug in the paper/code?

@valvgab-bh

@WeilunWang thanks a lot for your code, it is really nice work! :)

Coming to the issue, I also find that the implementation differs from what is described in the paper.
If we try to re-arrange the elements in the equation, we get:

model_output = output(image | null_label) + s * (output(image | labelmap) - output(image | null_label))
             = [...] 
             = output(image | labelmap)  - s' * (output(image | labelmap) - output(image | null_label))

where s' = 1 - s.

So the sign in front of the parenthesis has changed. Does this mean that, instead of moving away from the model bias output(image | null_label) by that distance, we are going in the opposite direction? Could you clarify this, please? :)
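A quick numeric check of that rearrangement (illustrative values only, with s > 1 as assumed by the sampling code):

import torch as th

out_label = th.tensor([1.0, 2.0, 3.0])  # output(image | labelmap)
out_null = th.tensor([0.5, 0.5, 0.5])   # output(image | null_label)
s = 2.0
s_prime = 1.0 - s

lhs = out_null + s * (out_label - out_null)         # what the code computes
rhs = out_label - s_prime * (out_label - out_null)  # rearranged form with s' = 1 - s

print(th.allclose(lhs, rhs))  # True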

@obaghirli

I believe it is a bug in the paper, not in the code.

@obaghirli

Figure 3(c) in the paper is in sync with the code.

@HuangChiEn

It seems the author follows the golden rule of programming: "If it works, don't touch it (don't try to understand it)" lol

@LexieYang

Hi, does anyone know the name of the argparse parameter for the guidance scale s? When I debug the code, the following if statement is false:

if 's' in model_kwargs and model_kwargs['s'] > 1.0: # FALSE
            model_output_zero = model(x, self._scale_timesteps(t), y=th.zeros_like(model_kwargs['y']))
            model_output[:, :3] = model_output_zero[:, :3] + model_kwargs['s'] * (model_output[:, :3] - model_output_zero[:, :3])

In this case, the classifier-free guidance is not functional at all!
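For what it's worth, here is a minimal sketch of what would make that branch active. The flag name --s below is hypothetical (I don't know what the repo's sampling script actually calls it); the important part is that whatever gets parsed ends up in model_kwargs under the key 's' with a value greater than 1.0:

import argparse
import torch as th

# Hypothetical flag name; the key point is populating model_kwargs['s'] > 1.0
# before sampling, otherwise the guidance branch in p_mean_variance is skipped.
parser = argparse.ArgumentParser()
parser.add_argument("--s", type=float, default=1.5, help="classifier-free guidance scale")
args = parser.parse_args()

model_kwargs = {"y": th.zeros(1, 35, 256, 256)}  # placeholder label map tensor
model_kwargs["s"] = args.s  # without this key, the if statement above stays False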
