
Add Hidden Size for DeepSpeed integration #23

Open
infosechoudini opened this issue Oct 5, 2023 · 2 comments

@infosechoudini

Utilizing DeepSpeed requires `model.hidden_size` to be available so that `auto` values in the ZeRO optimization config (e.g. `zero_optimization.reduce_bucket_size`) can be resolved. I'm guessing that `config.decoder_embed_dim` is the hidden size.

So we'd just need to add the following to the model's `__init__`:

    def __init__(self, config: RetNetConfig, embed_tokens: nn.Embedding = None):
        super().__init__(config)
        self.config = config

        self.dropout_module = torch.nn.Dropout(config.dropout)

        self.embed_dim = config.decoder_embed_dim
        self.embed_scale = 1.0 if config.no_scale_embedding else math.sqrt(self.embed_dim)

        ## NEW CODE FOR DEEPSPEED: expose hidden_size so DeepSpeed can
        ## resolve "auto" values such as reduce_bucket_size
        self.hidden_size = config.decoder_embed_dim
        ## END NEW CODE FOR DEEPSPEED

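For reference, this is the kind of ZeRO config the attribute enables. A minimal sketch, assuming the Hugging Face Trainer integration, which (per its documentation) replaces `auto` for `reduce_bucket_size` with a value derived from the model config's `hidden_size`; the resolution step below is a hypothetical mirror of that behavior, not DeepSpeed's actual code:

```python
# Sketch: a ZeRO stage-2 config using "auto" values, and a hypothetical
# resolution step mimicking what the HF/DeepSpeed integration does with
# config.hidden_size -- which is why the attribute must exist.
ds_config = {
    "zero_optimization": {
        "stage": 2,
        "reduce_bucket_size": "auto",
        "allgather_bucket_size": "auto",
    }
}

hidden_size = 2048  # would come from model config.hidden_size

resolved = dict(ds_config["zero_optimization"])
if resolved["reduce_bucket_size"] == "auto":
    # HF's integration documents this as hidden_size * hidden_size
    resolved["reduce_bucket_size"] = hidden_size * hidden_size
```

Without `hidden_size` on the config, the `auto` values cannot be filled in and the integration fails.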

@infosechoudini infosechoudini changed the title Add Hidden Size Add Hidden Size for DeepSpeed integration Oct 5, 2023
@infosechoudini
Author

Correction: it needs to be added to configuration.json and configuration.py instead.
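A minimal sketch of what the configuration.py change could look like: alias the existing `decoder_embed_dim` under the `hidden_size` name that DeepSpeed looks up. `RetNetConfig` here is a stripped-down stand-in for the real class, not its actual definition:

```python
# Hypothetical configuration.py change: expose decoder_embed_dim as
# hidden_size via a read-only property, so DeepSpeed's "auto" resolution
# (which reads config.hidden_size) finds the embedding width.
class RetNetConfig:
    def __init__(self, decoder_embed_dim: int = 768, dropout: float = 0.0):
        self.decoder_embed_dim = decoder_embed_dim
        self.dropout = dropout

    @property
    def hidden_size(self) -> int:
        # Alias: both names refer to the same underlying value.
        return self.decoder_embed_dim
```

If `RetNetConfig` subclasses `transformers.PretrainedConfig`, the same aliasing can alternatively be done with its `attribute_map` class attribute (e.g. mapping `"hidden_size"` to `"decoder_embed_dim"`), which also covers serialization to configuration.json.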

@syncdoth
Owner

syncdoth commented Oct 8, 2023

Could you please make it a PR? I'm not using DeepSpeed at the moment, so I can't confirm where the changes must occur.
