
[BugFix][Core] Fix BlockManagerV2 when Encoder Input is None #9103

Merged (60 commits) on Oct 7, 2024

Conversation

@sroy745 (Contributor) commented Oct 6, 2024

This PR addresses an issue encountered with certain encoder-decoder models, e.g. Llama-3.2-11B-Vision-Instruct, where `encoder_input` is set to `None`. This situation arises in the following test cases:

tests/models/encoder_decoder/vision_language/test_mllama.py::test_models[5-128-bfloat16-sizes0-meta-llama/Llama-3.2-11B-Vision-Instruct]
tests/models/encoder_decoder/vision_language/test_mllama.py::test_models[5-128-bfloat16-sizes4-meta-llama/Llama-3.2-11B-Vision-Instruct]

Currently, these tests pass with BlockManagerV1 because, when `encoder_input` is `None` during the allocation of the cross block table, an empty BlockTable is created without any blocks. However, this behavior does not hold for BlockManagerV2, where adding a BlockTable and attempting to allocate blocks for an empty or `None` `encoder_input` leads to assertion failures.

We make the following changes in this PR:

  1. For cases where `encoder_input` is `None`, we create an empty BlockTable, similar to the behavior in BlockManagerV1, but without adding any blocks to it.
  2. We have removed two assertions in BlockManagerV2 that pertain to free blocks and the calculation of physical block ids. These assertions failed when the BlockTable is empty, which is a valid state in this use case, so removing them does not affect the BlockManager's functionality.
  3. With these changes, we are re-enabling BlockManagerV2 for encoder-decoder models.
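Changes (1) and (2) above can be illustrated with a minimal sketch. This is not vLLM's actual code: `BlockTable`, `allocate`, and `allocate_cross_block_table` are simplified stand-ins assumed here only to show the pattern of always registering a cross block table but allocating blocks only when encoder input is actually present.

```python
class BlockTable:
    """Simplified stand-in for a block table: a list of allocated block ids."""
    def __init__(self):
        self.blocks = []

    def allocate(self, num_tokens, block_size=16):
        # Allocate one block per block_size tokens (ids here are illustrative).
        num_blocks = -(-num_tokens // block_size)  # ceiling division
        self.blocks = list(range(num_blocks))


def allocate_cross_block_table(encoder_input):
    """Mirror of change (1): always create the BlockTable, but only
    allocate blocks when there is non-empty encoder input, so an empty
    table is a valid state rather than an assertion failure."""
    table = BlockTable()
    if encoder_input:  # False for both None and an empty sequence
        table.allocate(len(encoder_input))
    return table


# Normal encoder input: blocks are allocated as before.
assert allocate_cross_block_table(list(range(32))).blocks == [0, 1]
# encoder_input is None (the Llama-3.2-11B-Vision-Instruct case):
# the cross block table stays empty, matching BlockManagerV1's behavior.
assert allocate_cross_block_table(None).blocks == []
```

Under this sketch, downstream code that computes physical block ids must tolerate an empty block list, which is why the corresponding assertions were dropped in change (2).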

FIX #9099
FIX #9084

sroy745 added 30 commits May 28, 2024 20:39

github-actions bot commented Oct 6, 2024

👋 Hi! Thank you for contributing to the vLLM project.
Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small and essential subset of CI tests to quickly catch errors. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can do one of these:

  • Add ready label to the PR
  • Enable auto-merge.

🚀

@sroy745 sroy745 marked this pull request as draft October 6, 2024 04:22
@sroy745 sroy745 changed the title [WIP][BugFix][Core] Fix BlockManagerV2 when Encoder Input is None [BugFix][Core] Fix BlockManagerV2 when Encoder Input is None Oct 6, 2024
@sroy745 (Contributor, Author) commented Oct 6, 2024

@comaniac / @heheda12345 PTAL when you get a chance

@comaniac (Collaborator) left a comment:

LGTM

vllm/core/block_manager_v2.py (review thread, outdated and resolved)
@comaniac comaniac marked this pull request as ready for review October 6, 2024 19:12
@comaniac comaniac added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 6, 2024
Co-authored-by: Cody Yu <hao.yu.cody@gmail.com>
@comaniac comaniac enabled auto-merge (squash) October 6, 2024 20:02
@heheda12345 (Collaborator) commented:

LGTM too.

auto-merge was automatically disabled October 6, 2024 23:56

Head branch was pushed to by a user without write access

@comaniac comaniac enabled auto-merge (squash) October 7, 2024 00:03
@comaniac comaniac merged commit c8f26bb into vllm-project:main Oct 7, 2024
60 checks passed
Labels
ready ONLY add when PR is ready to merge/full CI is needed

Successfully merging this pull request may close these issues.

[Bug]: Llama-3.2-11B-Vision-Instruct which is an encoder-decoder model fails with BlockManager V2
3 participants