Actions: neuralmagic/vllm

clang-format

248 workflow runs

[TPU] Correctly profile peak memory usage & Upgrade PyTorch XLA (#9438)
clang-format #174: Commit 211fe91 pushed by tlrmchlsmth
October 30, 2024 13:13 21s main
[CI][Bugfix] Skip chameleon for transformers 4.46.1 (#9808)
clang-format #173: Commit ab6f981 pushed by mgoin
October 29, 2024 19:05 16s main
[Model] Add BNB quantization support for Mllama (#9720)
clang-format #172: Commit 09500f7 pushed by tlrmchlsmth
October 29, 2024 14:37 21s main
Hqq support
clang-format #171: Pull request #21 synchronize by ElizaWszola
October 25, 2024 13:55 21s hqq-support
Hqq support
clang-format #170: Pull request #21 synchronize by ElizaWszola
October 25, 2024 12:27 18s hqq-support
Hqq support
clang-format #169: Pull request #21 synchronize by ElizaWszola
October 25, 2024 06:55 20s hqq-support
Hqq support
clang-format #168: Pull request #21 synchronize by ElizaWszola
October 24, 2024 15:25 19s hqq-support
Hqq support
clang-format #167: Pull request #21 synchronize by ElizaWszola
October 24, 2024 14:24 19s hqq-support
Hqq support
clang-format #166: Pull request #21 synchronize by ElizaWszola
October 24, 2024 14:21 22s hqq-support
[Misc] Separate total and output tokens in benchmark_throughput.py (#…
clang-format #165: Commit fd0e2cf pushed by tlrmchlsmth
October 23, 2024 17:06 17s main
Hqq support
clang-format #164: Pull request #21 synchronize by ElizaWszola
October 23, 2024 14:16 19s hqq-support
[Misc] Make benchmarks use EngineArgs (#9529)
clang-format #163: Commit cb6fdaa pushed by tlrmchlsmth
October 22, 2024 22:40 17s main
[V1] Implement vLLM V1 [1/N] (#9289)
clang-format #162: Commit 6c5af09 pushed by ElizaWszola
October 22, 2024 11:15 19s main
[Frontend][Misc] Goodput metric support (#9338)
clang-format #161: Commit 855e0e6 pushed by tlrmchlsmth
October 20, 2024 20:08 19s main
[Model] Support Pixtral models in the HF Transformers format (#9036)
clang-format #160: Commit 3921a2f pushed by tlrmchlsmth
October 18, 2024 19:45 19s main
[Frontend][Feature] Add jamba tool parser (#9154)
clang-format #159: Commit d2b1bf5 pushed by tlrmchlsmth
October 18, 2024 12:52 20s main
Add notes on the use of Slack (#9442)
clang-format #158: Commit dbfa8d3 pushed by tlrmchlsmth
October 17, 2024 13:29 16s main
DO NOT MERGE : Layer-by-Layer Profiling
clang-format #157: Pull request #3 synchronize by LucasWilkinson
October 16, 2024 19:36 20s varun/main-with-profiler
Support mistral interleaved attn (#9414)
clang-format #156: Commit 415f76a pushed by tlrmchlsmth
October 16, 2024 15:47 18s main
[Bugfix][CI/Build] Fix CUDA 11.8 Build (#9386)
clang-format #155: Commit 717a5f8 pushed by dsikka
October 16, 2024 01:56 17s main
[Kernel] adding fused moe kernel config for L40S TP4 (#9245)
clang-format #154: Commit f710090 pushed by tlrmchlsmth
October 11, 2024 16:09 20s main
[Misc][LoRA] Support loading LoRA weights for target_modules in reg f…
clang-format #153: Commit 36ea790 pushed by tlrmchlsmth
October 11, 2024 13:14 20s main
[torch.compile] integration with compilation control (#9058)
clang-format #152: Commit e4d652e pushed by tlrmchlsmth
October 10, 2024 20:05 17s main
[Bugfix] Fix lm_head weights tying with lora for llama (#9227)
clang-format #151: Commit 07c11cf pushed by tlrmchlsmth
October 10, 2024 14:02 16s main
DO NOT MERGE : Layer-by-Layer Profiling
clang-format #150: Pull request #3 synchronize by LucasWilkinson
October 9, 2024 21:32 14s varun/main-with-profiler