Issues: vllm-project/vllm
- #11371 [Bug]: Prefix caching doesn't work for LlavaOneVision (label: bug, opened Dec 20, 2024 by sleepwalker2017)
- #11366 [Bug]: Serving occasionally fails with RuntimeError: CUDA error: an illegal memory access was encountered (label: bug, opened Dec 20, 2024 by pangr)
- #11365 [Feature]: Add support for attention score output (label: feature request, opened Dec 20, 2024 by WoutDeRijck)
- #11364 [Misc]: What is 'residual' used for in the IntermediateTensor class? (label: misc, opened Dec 20, 2024 by oldcpple)
- #11363 [Bug]: The following fields were present in the request but ignored: {'schema'} (label: bug, opened Dec 20, 2024 by Quang-elec44)
- #11361 [Bug]: Priority scheduling doesn't improve token/s: higher-priority requests get no more token/s than requests without a priority set (label: bug, opened Dec 20, 2024 by kar9999; see the priority-scheduling sketch after this list)
- #11360 [Feature]: meta-llama/Prompt-Guard-86M usage raises a ValueError (label: feature request, opened Dec 20, 2024 by burakaktas35)
- #11356 [Bug]: vLLM 0.6.3.post1 crashes when deploying Qwen2-VL 72B (label: bug, opened Dec 20, 2024 by xxlight)
- #11352 [Bug]: V100 cannot use --enable-chunked-prefill with dtype float16, though it works with dtype float32 (label: bug, opened Dec 20, 2024 by warlockedward)
- #11347 [New Model]: answerdotai/ModernBERT-large (label: new model, opened Dec 19, 2024 by pooyadavoodi)
- #11346 [Bug]: No profiler output when VLLM_TORCH_PROFILER_DIR is enabled for vllm serve (label: bug, opened Dec 19, 2024 by ziyang-arch; see the profiling sketch after this list)
- #11345 [Performance]: 1P1D disaggregation performance (label: performance, opened Dec 19, 2024 by Jeffwan)
- #11343 [Bug]: PaliGemma 2 model loading error (label: bug, opened Dec 19, 2024 by mmderakhshani)
- #11342 [Bug]: Multi-node CPU inference on macOS fails when calling intel_extension_for_pytorch (label: bug, opened Dec 19, 2024 by MoSedkyy)
- #11340 [Bug]: CUDA illegal memory access in flash attention only for specific values of --max-num-seqs (with an AWQ model) (label: bug, opened Dec 19, 2024 by camfarineau)
- #11337 [Usage]: How to expand the inference context length (e.g. to 128k or 256k) on multimodal models (label: usage, opened Dec 19, 2024 by Wiselnn570; see the context-length sketch after this list)
- #11335 [Bug]: vLLM crashes under 20 concurrent requests with long content (~9k words) (label: bug, opened Dec 19, 2024 by Flynn-Zh)
- #11329 [Bug]: FP8 KV cache causes RuntimeError in the v1 engine (label: bug, opened Dec 19, 2024 by Nekofish-L)
- #11323 [Usage]: How to use torch.compile (label: usage, opened Dec 19, 2024 by chenglu66; see the torch.compile sketch after this list)
- #11322 [Usage]: What startup parameters are needed to use guided decoding, chunked prefill, and prefix caching simultaneously with a multimodal model? (label: usage, opened Dec 19, 2024 by wciq1208; see the combined-features sketch after this list)
- #11321 [Bug]: Error encountered while attempting a Python-only build (without compilation) for vLLM v0.6.5 (label: bug, opened Dec 19, 2024 by Leander-wang)
- #11320 [Performance]: Performance degradation due to a CPU bottleneck when serving embedding models on GPUs (label: performance, opened Dec 19, 2024 by ashgold)
- #11319 [Doc]: Update the default max_num_batch_tokens for chunked prefill (label: documentation, opened Dec 19, 2024 by toslunar)
- #11317 [Performance]: vLLM 0.6.5 with GLM4-9B-Chat and dynamically loaded LoRA: inference performance drops considerably on long inputs (label: performance, opened Dec 19, 2024 by zh19980310)
- #11312 [Bug]: Chat with n>1 breaks xgrammar (label: bug, opened Dec 18, 2024 by joerunde)
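A few hedged sketches for the how-to items above follow; each is illustrative, not a verified recipe.

For #11361, priority scheduling has to be enabled via the `scheduling_policy` engine argument, after which each request carries a `priority` value. A minimal sketch, where the model name is a placeholder and the assumption that lower values are scheduled earlier should be verified against your vLLM version:

```python
# Sketch for #11361: priority scheduling in vLLM's offline API.
# scheduling_policy="priority" is the documented engine arg; the per-request
# priorities below assume lower value = scheduled earlier.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m", scheduling_policy="priority")

outputs = llm.generate(
    ["urgent prompt", "background prompt"],
    SamplingParams(max_tokens=64),
    priority=[0, 10],  # assumed: lower value = higher priority
)
for out in outputs:
    print(out.outputs[0].text)
```

Note that priority only reorders scheduling under contention; when the batch has spare capacity both requests run together, which may explain why per-request token/s looks unchanged in the report.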
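For #11346, the documented flow is to launch the server with VLLM_TORCH_PROFILER_DIR set and then toggle tracing over HTTP; traces are written only when profiling is stopped, which is a common reason for seeing no output. A sketch, assuming the default server address:

```python
# Sketch for #11346: driving the torch profiler on a running vLLM server.
# Assumes the server was started as, e.g.:
#   VLLM_TORCH_PROFILER_DIR=/tmp/vllm_traces vllm serve <model>
# The endpoints below only exist when that variable is set.
import requests

BASE = "http://localhost:8000"  # assumed default `vllm serve` address

requests.post(f"{BASE}/start_profile").raise_for_status()
# ... issue some completion requests here so there is activity to trace ...
requests.post(f"{BASE}/stop_profile").raise_for_status()  # traces flushed now
```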
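For #11337, the context window is capped by the max_model_len engine argument, and going past a model's native window generally also requires a RoPE-scaling override. A sketch in which the model choice and the YaRN-style rope_scaling keys are assumptions to check against the checkpoint:

```python
# Sketch for #11337: asking vLLM for a longer context window offline.
# max_model_len is a documented engine arg; the rope_scaling dict is a
# hypothetical YaRN-style override whose key names and values must match
# what the checkpoint supports (multimodal models may add constraints).
from vllm import LLM

llm = LLM(
    model="Qwen/Qwen2-VL-7B-Instruct",  # illustrative model choice
    max_model_len=131072,               # ~128k tokens; needs enough KV-cache memory
    rope_scaling={
        "rope_type": "yarn",            # assumed key names
        "factor": 4.0,
        "original_max_position_embeddings": 32768,
    },
)
```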
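For #11323, vLLM's own torch.compile integration is version-specific (its knobs changed across the 0.6.x releases), so check the docs for your exact version; as a baseline, the underlying PyTorch API the title refers to is:

```python
# Sketch for #11323: the plain PyTorch torch.compile API that vLLM builds on.
import torch

def f(x: torch.Tensor) -> torch.Tensor:
    return torch.relu(x) * 2.0

compiled_f = torch.compile(f)      # compiles lazily, on the first call
print(compiled_f(torch.randn(4)))
```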
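For #11322, chunked prefill and prefix caching are server startup flags, while guided decoding is requested per call through vLLM's guided_json extension field. A sketch with the model name and schema as placeholders, making no claim that all three interact cleanly on multimodal models (which is exactly what the issue asks):

```python
# Sketch for #11322, assuming the server was started with:
#   vllm serve <model> --enable-chunked-prefill --enable-prefix-caching
# guided_json is vLLM's OpenAI-compatible extension for guided decoding.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="<model>",  # placeholder: use the served model name
    messages=[{"role": "user", "content": "Answer as JSON."}],
    extra_body={
        "guided_json": {  # illustrative schema
            "type": "object",
            "properties": {"answer": {"type": "string"}},
            "required": ["answer"],
        }
    },
)
print(resp.choices[0].message.content)
```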