Style modulation layers, style parameters and controlling low-level features. #34

Open
007prateekd opened this issue May 29, 2022 · 2 comments

007prateekd commented May 29, 2022

Hey, great work!
I had a couple of queries:

  1. In the paper it is mentioned that there are 26 style modulation layers, but in the code there seem to be only 18, since n_latent = 18.
  2. What exactly do the style parameters s(w) correspond to in the code?
  3. For the pretrained styles, is there any way to control low-level features like the eyes, nose, etc. without training again? I know that during fine-tuning we can use blending (with RIS) and different masks to control them, but is there a way to do this for a model that is already fine-tuned?
  4. I see a change in results when fine-tuning the model using JoJo's photo. Is it because e4e is used instead of ReStyle in the code?

mchong6 (Owner) commented May 30, 2022

  1. Some layers share the same latent code (for example, the torgb layers), so even though there are only 18 latent codes, there are 26 style modulation layers.
  2. It corresponds to the output of this line:
    style = self.modulation(style)
  3. One way to do that is to load the original model and interpolate between the fine-tuned and original features, as described in the feature interpolation section of the paper. To control the eyes, nose, etc. specifically, you can interpolate only those regions by cropping the corresponding spatial regions of the features. See https://arxiv.org/abs/2111.01619
  4. There have been several versions of the code with different hyperparameters, so you might not get exactly the same result.
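
For anyone trying point 3, here is a rough sketch of what region-wise feature interpolation could look like. It uses NumPy arrays as stand-ins for the feature maps, and `blend_region`, the box coordinates and `alpha` are hypothetical names chosen for illustration, not functions or parameters from this repository. It only shows the masking idea; hooking it into the two generators is left out.

```python
import numpy as np

def blend_region(feat_orig, feat_ft, box, alpha):
    """Blend original features into fine-tuned ones inside a rectangular box.

    feat_orig, feat_ft: arrays of shape (C, H, W) taken from the same layer
        of the original and the fine-tuned generator (assumed inputs).
    box: (top, bottom, left, right) in feature-map coordinates.
    alpha: 0 keeps the fine-tuned features, 1 restores the original ones.
    """
    top, bottom, left, right = box
    # Spatial mask of shape (H, W); it broadcasts over the channel axis.
    mask = np.zeros(feat_ft.shape[-2:], dtype=feat_ft.dtype)
    mask[top:bottom, left:right] = alpha
    return (1.0 - mask) * feat_ft + mask * feat_orig

# Toy features: zeros for the original model, ones for the fine-tuned one.
feat_orig = np.zeros((4, 8, 8), dtype=np.float32)
feat_ft = np.ones((4, 8, 8), dtype=np.float32)

# Pull a hypothetical "eye" region 75% of the way back to the original.
mixed = blend_region(feat_orig, feat_ft, box=(2, 4, 1, 7), alpha=0.75)
```

Outside the box the fine-tuned features are untouched, so only the cropped region moves toward the original model. The box would need to be scaled to the resolution of whichever layer is being blended.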

007prateekd (Author) commented

  1. So if I understood it correctly, then there are 17 StyledConv layers with independent codes and 9 ToRGB layers with shared codes. Hence, 26 layers and 18 codes in total.

  2. Quoting your paper,

    We GAN invert the reference style image y to obtain a style code w = T(y) and from that a set of s parameters s(w).

    So, by obtaining s(w), do you mean obtaining them implicitly through the line of code you mentioned? I ask because the fine-tuning code doesn't seem to contain any explicit step that obtains these parameters.

  3. Great. Will definitely give it a shot.

  4. That makes sense.
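
To make points 1 and 2 concrete, here is a toy sketch of how 18 latent codes could produce 26 sets of style parameters. The layer-to-code index layout below is an assumption modeled on a common StyleGAN2 PyTorch implementation and has not been checked against this repository. The per-layer affine maps are random stand-ins for each layer's `self.modulation`, and `LATENT_DIM` is a toy size.

```python
import numpy as np

N_LATENT = 18    # number of latent codes, as discussed in the thread
LATENT_DIM = 8   # toy dimension for illustration only
N_BLOCKS = 8     # resolution blocks after the first one (assumed)

def layer_to_latent_index():
    """Return (layer_name, latent_index) pairs in forward order (assumed layout)."""
    layers = [("conv1", 0), ("to_rgb1", 1)]
    i = 1
    for b in range(N_BLOCKS):
        layers.append((f"convs[{2 * b}]", i))
        layers.append((f"convs[{2 * b + 1}]", i + 1))
        layers.append((f"to_rgbs[{b}]", i + 2))
        i += 2
    return layers

rng = np.random.default_rng(0)
layers = layer_to_latent_index()

# One hypothetical affine map (weight, bias) per modulation layer.
# In a real generator the output size depends on the layer's channel count.
affines = {
    name: (rng.normal(size=(LATENT_DIM, LATENT_DIM)), np.ones(LATENT_DIM))
    for name, _ in layers
}

def style_params(w_plus):
    """Map codes of shape (N_LATENT, LATENT_DIM) to one style vector per layer."""
    return {
        name: affines[name][0] @ w_plus[idx] + affines[name][1]
        for name, idx in layers
    }

w_plus = rng.normal(size=(N_LATENT, LATENT_DIM))
s = style_params(w_plus)
print(len(s), len({idx for _, idx in layers}))  # prints: 26 18
```

Under this assumed layout, each ToRGB layer except the last reads the same code as the first convolution of the next block, and the last ToRGB layer reads the final code on its own. In that reading, s(w) is never stored anywhere: it is recomputed inside each layer's forward pass from the code that layer receives.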

007prateekd changed the title from "Regarding style modulation layers, style parameters and controlling low-level features." to "Style modulation layers, style parameters and controlling low-level features." on Feb 16, 2023