Issues: vllm-project/vllm
- [Feature]: Enable CUDA Graph without turning on torch.compile / Inductor for V1 (feature request) · #15896, opened Apr 1, 2025 by houseroad
- [Bug]: TP with external_launcher is not working with vLLM version 0.8.0 and above (bug) · #15895, opened Apr 1, 2025 by toslali-ibm
- [Performance]: R1 on H20, 0.8.1 vs 0.7.4dev122 benchmark: what is the reason for the 14% throughput (tokens/s) improvement in 0.8.1? (performance) · #15881, opened Apr 1, 2025 by chuanyi-zjc
- [Feature]: Fused MoE config for Nvidia RTX 3090 (feature request) · #15880, opened Apr 1, 2025 by davidsyoung
- [Bug]: CPU offload not working for vllm serve (bug) · #15877, opened Apr 1, 2025 by hamaadtahiir
- [Doc]: What versions of vllm and lmcache does the example at https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/cpu_offload_lmcache.py use? (documentation) · #15874, opened Apr 1, 2025 by tanov25
- [Bug]: Building Docker image from Dockerfile fails (bug) · #15872, opened Apr 1, 2025 by surak
- [Bug]: CPU offload not working for DeepSeek-V2-Lite-Chat (bug) · #15871, opened Apr 1, 2025 by ymcki
- [Bug]: gemma3-27b generation speed with compile in 0.8.2 and 0.8.1 is slower than 0.8.0 without compile (bug) · #15870, opened Apr 1, 2025 by hanggun
- [Performance]: Qwen2.5VL preprocessing is extremely slow with large images, leading to low GPU usage (performance) · #15869, opened Apr 1, 2025 by tenacioustommy
- [Bug]: FP8 accuracy decreases with long inputs (bug) · #15865, opened Apr 1, 2025 by fan-niu
- [Bug]: qwen2.5-omni model failed to start (bug) · #15864, opened Apr 1, 2025 by hackerHiJu
- [Installation]: Model checkpoint shards reload every time on Kubernetes with the vLLM image (even if already downloaded) (installation) · #15862, opened Apr 1, 2025 by Prashantsaini25
- Does vLLM support structured pruning? (usage) · #15854, opened Apr 1, 2025 by wangwenmingaa
- [Bug]: [V1] Tesla T4 does not work with V1 (bug) · #15853, opened Apr 1, 2025 by maobaolong
- [New Model]: nomic-ai/nomic-embed-text-v2-moe and nvidia/NV-Embed-v2 (new model) · #15849, opened Apr 1, 2025 by RohitRathore1
- [Bug]: served-model-name not being returned in the model field of the response (bug) · #15845, opened Apr 1, 2025 by nbertagnolli
- [Bug]: ImportError: _flash_supports_window_size missing for baichuan-inc/Baichuan-M1-14B-Instruct (with trust_remote_code=True) in vLLM v0.8.2 (bug) · #15844, opened Apr 1, 2025 by Lee-Ju-Yeong
- [Feature]: A Hacked Classifier-Free Guidance Method (feature request) · #15839, opened Mar 31, 2025 by MSLDCherryPick
- [Bug]: Gemma-3 (27B) can't load save_pretrained() checkpoint: AssertionError: expected size 5376==2560, stride 1==1 at dim=0 (bug) · #15836, opened Mar 31, 2025 by BiEchi
- [Bug]: [TPU] V1 seems to silently crash after a while (bug) · #15833, opened Mar 31, 2025 by kiratp
- [Bug]: Docker build takes more than 5000 seconds (bug) · #15827, opened Mar 31, 2025 by HadiSDev
- [Bug]: Gemma 3 27B IT model doesn't read the image (responds to text only) (bug) · #15825, opened Mar 31, 2025 by dawnik17
- [Bug]: Failed to run a GPTQModel-quantized model with vLLM (bug) · #15817, opened Mar 31, 2025 by Maglanyulan