
about decoding topk_masking #3

Open
violet-sto opened this issue May 10, 2024 · 1 comment
@violet-sto
Hi

Thanks for your excellent work. I have a question about the rate schedule for topk_masking.

As described in the appendix, "To ensure that the degree of noise decreases as the generation process proceeds, we schedule k to increase from 1 to N monotonically as the diffusion step t goes from T to 1." However, in the code,

```python
lowest_k_mask = topk_masking(_scores_for_topk, cutoff_len, stochastic=False)
```

the k tokens with the lowest confidence are masked, instead of the highest. Is there an inconsistency here?
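For concreteness, here is a minimal sketch of such a monotone schedule. The linear interpolation below is hypothetical (the paper only states that k increases from 1 to N monotonically; the repo's actual formula may differ):

```python
def cutoff_len_schedule(t, T, N):
    # Hypothetical linear schedule: k grows from 1 (at t = T) to N (at t = 1),
    # so more tokens are unmasked (denoised) as generation proceeds.
    return round(1 + (N - 1) * (T - t) / (T - 1))

# As t goes from T down to 1, k increases monotonically from 1 to N.
ks = [cutoff_len_schedule(t, T=10, N=20) for t in range(10, 0, -1)]
```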

Best regards

@LZhengisme
Collaborator

Hi there,

Thanks for reaching out! In

```python
sorted_index = _scores.sort(-1)[0]
cutoff = sorted_index.gather(dim=-1, index=cutoff_len) + 1e-10
# cutoff_len = k -> select k + 1 tokens
masking = _scores < cutoff
return masking
```

the `topk_masking` function actually returns a mask indicating the *unselected* elements, i.e. the complement of the highest-scoring ones. We implement it this way to simplify the subsequent calculations for the tokens being denoised.
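To illustrate the point, here is a pure-Python sketch of the same logic for a 1-D list of scores (the real implementation operates on batched PyTorch tensors; the concrete numbers here are only for demonstration):

```python
def topk_masking(scores, cutoff_len):
    # Sketch of the torch snippet above: mark the lowest-scoring entries,
    # i.e. the complement of the top-scoring ones.
    sorted_scores = sorted(scores)              # ascending, like _scores.sort(-1)[0]
    cutoff = sorted_scores[cutoff_len] + 1e-10  # value at position cutoff_len = k
    return [s < cutoff for s in scores]         # True = low confidence, stays masked

scores = [0.9, 0.1, 0.5, 0.7, 0.3]
mask = topk_masking(scores, cutoff_len=2)       # masks the k + 1 = 3 lowest scores
# The complement of the mask selects the highest-confidence tokens to denoise.
selected = [s for s, m in zip(scores, mask) if not m]
# mask == [False, True, True, False, True]; selected == [0.9, 0.7]
```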

Hope this clears things up!
