Add ragged paged attention #8659

Open
vanbasten23 wants to merge 9 commits into master from xiowei/add_ragged_paged_attention
Conversation

vanbasten23
Collaborator

@vanbasten23 vanbasten23 commented Jan 31, 2025

Test plan:

LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt

cc: @miladm

@bythew3i

Test plan:

LIBTPU_INIT_ARGS=--xla_tpu_scoped_vmem_limit_kib=65536  python /workspaces/persist/pytorch/xla/test/test_ragged_paged_attention_kernel.py 2>&1 | tee out.txt

How is 65536 calculated?

@vanbasten23
Collaborator Author

I found the value in a ticket where someone else used it. As I recall, the number is the VMEM limit on one TPU generation.
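For reference, the `_kib` suffix on the flag suggests the value is in KiB, in which case 65536 KiB is 64 MiB. The snippet below is only a sketch of that arithmetic: the helper name `vmem_limit_flag` is hypothetical, and the assumption that libtpu reads `LIBTPU_INIT_ARGS` from the environment at startup is taken from the test plan command, not from libtpu documentation.

```python
import os

KIB_PER_MIB = 1024


def vmem_limit_flag(limit_mib: int) -> str:
    """Build the flag string from the test plan for a limit given in MiB.

    Hypothetical helper: it only formats the string and does not check
    whether the limit is valid for any particular TPU generation.
    """
    return f"--xla_tpu_scoped_vmem_limit_kib={limit_mib * KIB_PER_MIB}"


flag = vmem_limit_flag(64)
print(flag)  # --xla_tpu_scoped_vmem_limit_kib=65536

# Environment for a child process that would run the test script.
# Assumption: the variable has to be present before libtpu initializes.
env = dict(os.environ, LIBTPU_INIT_ARGS=flag)
```

The resulting `env` could be passed to `subprocess.run(..., env=env)` to reproduce the test plan command from Python.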

@vanbasten23 vanbasten23 force-pushed the xiowei/add_ragged_paged_attention branch from ad2f87c to 9e4b227 Compare February 1, 2025 00:32
@vanbasten23 vanbasten23 force-pushed the xiowei/add_ragged_paged_attention branch from 9e4b227 to 7fe5071 Compare February 3, 2025 05:41
@vanbasten23 vanbasten23 requested review from lsy323 and miladm February 3, 2025 21:31
@vanbasten23
Collaborator Author

The Build and test / CPU tests / test (benchmark_tests) failure is unrelated to this PR. (PR #8668, which contains no changes, fails the same check.)

@miladm
Collaborator

miladm commented Feb 3, 2025

cc on-duty @lsy323 to assist with the CI test failure before we merge, @vanbasten23

@@ -37,7 +37,8 @@ run_xla_hlo_debug python3 "$TEST_CDIR/scan/test_scan_debug.py"
python3 "$TEST_CDIR/test_pallas.py" -v
python3 "$TEST_CDIR/test_pallas_spmd.py"
XLA_DISABLE_FUNCTIONALIZATION=1 python3 "$TEST_CDIR/test_pallas_spmd.py"
python3 "$TEST_CDIR/test_tpu_paged_attention_kernel.py"
python3 "$TEST_CDIR/test_multi_queries_paged_attention_kernel.py"
Collaborator

@miladm miladm Feb 3, 2025

Isn't this line renamed? Do you need it?

3 participants