[Bugfix] Fix logit soft cap in flash-attn backend (vllm-project#7425)
WoosukKwon authored Aug 12, 2024
1 parent d2bc451 commit cfba4de
Showing 1 changed file with 1 addition and 0 deletions.
vllm/attention/backends/flash_attn.py (1 addition & 0 deletions)
@@ -563,6 +563,7 @@ def forward(
         softmax_scale=self.scale,
         causal=True,
         alibi_slopes=self.alibi_slopes,
+        softcap=self.logits_soft_cap,
     ).squeeze(1)
 
     # Reshape the output tensor.
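For context, the added softcap argument makes the FlashAttention kernel apply attention-logit soft capping, which models configured with a logits_soft_cap (e.g. Gemma 2) rely on; before this fix, this call did not forward the cap to the kernel. As a rough, non-fused illustration (a minimal sketch, not vLLM's or flash-attn's actual kernel code), soft capping bounds the pre-softmax attention logits with a tanh:

import torch

def soft_cap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Soft capping squashes logits into (-cap, cap): cap * tanh(logits / cap).
    return cap * torch.tanh(logits / cap)

def naive_capped_attention(q, k, v, scale: float, cap: float | None = None):
    # Reference (unfused) attention for illustration only; the flash-attn
    # kernel performs the equivalent capping internally when softcap > 0.
    logits = (q @ k.transpose(-2, -1)) * scale
    if cap is not None and cap > 0:
        logits = soft_cap(logits, cap)
    return torch.softmax(logits, dim=-1) @ v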