Using DeepSpeed with auto values for zero.reduce_bucket_size in the ZeRO optimization config requires model.hidden_size to be available. I'm guessing that config.decoder_embed_dim is the hidden size.
So we'd just need to add the following to the model's __init__:
def __init__(self, config: RetNetConfig, embed_tokens: nn.Embedding = None):
    super().__init__(config)
    self.config = config
    self.dropout_module = torch.nn.Dropout(config.dropout)
    self.embed_dim = config.decoder_embed_dim
    self.embed_scale = 1.0 if config.no_scale_embedding else math.sqrt(self.embed_dim)
    ## NEW CODE FOR DEEPSPEED
    self.hidden_size = config.decoder_embed_dim
    ## NEW CODE FOR DEEPSPEED
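For context, here is a rough sketch of why the attribute matters. It assumes the auto values are derived from the hidden size the way I understand the Hugging Face Trainer's DeepSpeed integration to do it (hidden_size squared for reduce_bucket_size). ToyConfig, ToyModel and resolve_auto_zero_values are hypothetical stand-ins written for illustration; they are not part of RetNet, DeepSpeed or Transformers.

```python
from dataclasses import dataclass


@dataclass
class ToyConfig:
    """Hypothetical stand-in for RetNetConfig; only the field used here."""
    decoder_embed_dim: int = 512


class ToyModel:
    """Hypothetical stand-in for the model; mirrors the proposed attribute."""

    def __init__(self, config: ToyConfig):
        self.config = config
        # Same idea as the proposed change: expose the embedding width as hidden_size.
        self.hidden_size = config.decoder_embed_dim


def resolve_auto_zero_values(zero_cfg: dict, hidden_size: int) -> dict:
    """Return a copy of zero_cfg with "auto" entries replaced by derived values.

    Assumption: the formulas below follow the Hugging Face Trainer integration
    as I understand it; check the installed version before relying on them.
    """
    derived = {
        "reduce_bucket_size": hidden_size * hidden_size,
        "stage3_prefetch_bucket_size": int(0.9 * hidden_size * hidden_size),
        "stage3_param_persistence_threshold": 10 * hidden_size,
    }
    resolved = dict(zero_cfg)
    for key, value in derived.items():
        # Only touch keys the user explicitly set to "auto".
        if resolved.get(key) == "auto":
            resolved[key] = value
    return resolved


model = ToyModel(ToyConfig(decoder_embed_dim=512))
zero_cfg = {"stage": 2, "reduce_bucket_size": "auto"}
resolved = resolve_auto_zero_values(zero_cfg, model.hidden_size)
print(resolved)
```

Without a hidden size to read, there is nothing to compute the bucket size from, which is why the auto setting fails.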
infosechoudini changed the title from "Add Hidden Size" to "Add Hidden Size for DeepSpeed integration" on Oct 5, 2023