Implement simple character level GPT and the trainer for it #122

Merged: 3 commits, Aug 2, 2024

@Aisuko (Member) commented Aug 2, 2024

import torch
import torch.nn as nn
from torch.nn import functional as F


class SimpleGPT(nn.Module):
    def __init__(self, vocab_size):
        super().__init__()
        # Row i of the lookup table holds the next-token logits for token i,
        # so the embedding dimension equals vocab_size.
        self.token_embedding_table = nn.Embedding(vocab_size, vocab_size)

    
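    def forward(self, idx, targets=None):
        # idx and targets are both (B, T) tensors of integer token ids
        logits = self.token_embedding_table(idx)  # (B, T, C), with C = vocab_size

        if targets is None:
            loss = None
        else:
            B, T, C = logits.shape
            # F.cross_entropy expects (N, C) logits and (N,) targets,
            # so flatten the batch and time dimensions together
            logits = logits.view(B * T, C)
            targets = targets.view(B * T)  # targets.view(-1) also works here
            loss = F.cross_entropy(logits, targets)
        # logits are (B*T, C) when targets are given, (B, T, C) otherwise
        return logits, loss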
    
    def generate(self, idx, max_new_tokens):
        # idx is a (B, T) tensor of token indices in the current context
        for _ in range(max_new_tokens):
            # get the predictions; self(idx) invokes forward() via nn.Module.__call__
            logits, _ = self(idx)  # (B, T, C) because no targets are passed
            # keep only the last time step
            logits = logits[:, -1, :]  # (B, C)
            # apply softmax to get probabilities
            probs = F.softmax(logits, dim=-1)  # (B, C)
            # sample one token per sequence from the distribution
            idx_next = torch.multinomial(probs, num_samples=1)  # (B, 1)
            # append the sampled index to the running sequence
            idx = torch.cat((idx, idx_next), dim=1)  # (B, T+1)
        return idx
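
A minimal sketch of how the model above might be trained and sampled from. The trainer added in this PR is not shown in the description, so everything outside `SimpleGPT` here is an assumption made for illustration: the toy corpus, the `get_batch` helper, `block_size`, `batch_size`, and the AdamW learning rate. It is not the PR's implementation. The class is repeated in compact form so the snippet runs on its own.

```python
import torch
import torch.nn as nn
from torch.nn import functional as F


class SimpleGPT(nn.Module):
    # Same lookup-table model as in the description, condensed for this sketch.
    def __init__(self, vocab_size):
        super().__init__()
        self.token_embedding_table = nn.Embedding(vocab_size, vocab_size)

    def forward(self, idx, targets=None):
        logits = self.token_embedding_table(idx)  # (B, T, C)
        if targets is None:
            return logits, None
        B, T, C = logits.shape
        logits = logits.view(B * T, C)
        loss = F.cross_entropy(logits, targets.view(B * T))
        return logits, loss

    def generate(self, idx, max_new_tokens):
        for _ in range(max_new_tokens):
            logits, _ = self(idx)
            probs = F.softmax(logits[:, -1, :], dim=-1)
            idx_next = torch.multinomial(probs, num_samples=1)
            idx = torch.cat((idx, idx_next), dim=1)
        return idx


torch.manual_seed(0)

# Hypothetical toy corpus and character-level vocabulary (illustration only).
text = "hello world, hello gpt. " * 20
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}
data = torch.tensor([stoi[ch] for ch in text], dtype=torch.long)

block_size, batch_size = 8, 16  # assumed hyperparameters


def get_batch():
    # Sample random windows; targets are the inputs shifted one position right.
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + block_size + 1] for i in ix])
    return x, y


model = SimpleGPT(len(chars))
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)

# Loss before any update, for comparison with the loss after training.
xb, yb = get_batch()
_, initial_loss = model(xb, yb)

for step in range(200):
    xb, yb = get_batch()
    _, loss = model(xb, yb)
    optimizer.zero_grad(set_to_none=True)
    loss.backward()
    optimizer.step()

final_loss = loss.item()

# Generate 20 new characters, starting from the token with index 0.
context = torch.zeros((1, 1), dtype=torch.long)
out = model.generate(context, max_new_tokens=20)
sample = "".join(itos[i] for i in out[0].tolist())
print(f"loss {initial_loss.item():.3f} -> {final_loss:.3f}")
print(repr(sample))
```

Because each token only sees itself when predicting the next one, the generated text stays close to character-pair statistics of the corpus; the loop is mainly useful as a check that the loss falls and that `generate` returns a `(B, T + max_new_tokens)` tensor.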

Aisuko added 2 commits August 2, 2024 02:06
Signed-off-by: Aisuko <urakiny@gmail.com>
Signed-off-by: Aisuko <urakiny@gmail.com>
@Aisuko Aisuko self-assigned this Aug 2, 2024
Signed-off-by: Aisuko <urakiny@gmail.com>
@Aisuko Aisuko requested a review from Micost August 2, 2024 08:48
@Aisuko Aisuko merged commit ddbad75 into main Aug 2, 2024
6 checks passed
@Aisuko Aisuko deleted the fix/dev branch August 2, 2024 08:48
@Aisuko Aisuko mentioned this pull request Aug 2, 2024