Expand supported attention head sizes (#752)
There's no reason for the current attention head size restrictions - we can theoretically support any size with the current implementations. This patch fixes that.
kzawora-intel authored Jan 29, 2025
1 parent 446eab2 commit 2d152ed
Showing 1 changed file with 1 addition and 1 deletion.
vllm/attention/ops/hpu_paged_attn.py (1 addition, 1 deletion)

@@ -28,7 +28,7 @@ class HPUPagedAttention:
 
     @staticmethod
     def get_supported_head_sizes() -> List[int]:
-        return [64, 80, 96, 112, 128, 256]
+        return list(range(1, 257))
 
     @staticmethod
     def get_kv_cache_shape(
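For illustration, a minimal usage sketch of the patched method: only the return value of get_supported_head_sizes() comes from this commit; the check_head_size() helper and its error message are hypothetical, standing in for however a backend might validate a model's head dimension.

# Usage sketch: check_head_size() is hypothetical and not part of this commit;
# get_supported_head_sizes() mirrors the patched return value.
from typing import List


class HPUPagedAttention:

    @staticmethod
    def get_supported_head_sizes() -> List[int]:
        # After this patch, every head size from 1 to 256 is reported as supported.
        return list(range(1, 257))


def check_head_size(head_size: int) -> None:
    # Hypothetical helper: reject head sizes outside the supported range.
    supported = HPUPagedAttention.get_supported_head_sizes()
    if head_size not in supported:
        raise ValueError(f"Head size {head_size} is not supported "
                         f"(supported: {supported[0]}-{supported[-1]}).")


check_head_size(128)  # previously supported, still accepted
check_head_size(40)   # newly accepted under the expanded range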
