Pull requests: vllm-project/vllm
[Docs] [QeRL] Layerwise Reloading Documentation
documentation
Improvements or additions to documentation
#40317
opened Apr 20, 2026 by
kylesayrs
Contributor
[Docs] Fix thinking_token_budget docs
documentation
Improvements or additions to documentation
#40316
opened Apr 20, 2026 by
milesial
Contributor
fix: Do not make function calls when request has no tools for /v1/responses
frontend
ready
ONLY add when PR is ready to merge/full CI is needed
#40314
opened Apr 20, 2026 by
terrytangyuan
Contributor
Revert "Fix MoE backend selection for LoRA (unquantized MoE)" (#40273)
#40313
opened Apr 20, 2026 by
vllm-agent
Draft
[Bugfix] Align chat completion tool_choice/tools validation with OpenAI spec
bug
Something isn't working
frontend
tool-calling
#40311
opened Apr 20, 2026 by
sfeng33
Collaborator
[Bugfix] Fix W4A8_FP8 MoE tp>1 correctness and view() TypeError
bug
Something isn't working
#40310
opened Apr 20, 2026 by
EdalatiAli
Contributor
Draft
4 tasks
[QeRL] Add warnings for extra memory buffering
#40309
opened Apr 20, 2026 by
kylesayrs
Contributor
[Bugfix] Fix hybrid KV manager for quantized per-token-head KV cache
bug
Something isn't working
v1
#40308
opened Apr 20, 2026 by
lesj0610
[ROCm][CI] Fix trust_remote_code AttributeError in EAGLE3 acceptance length test
rocm
Related to AMD ROCm
speculative-decoding
v1
#40306
opened Apr 19, 2026 by
AndreasKaratzas
Collaborator
1 task done
[Bug] Fix shm_broadcast PyCFunction descriptor corruption under JIT loads
bug
Something isn't working
#40303
opened Apr 19, 2026 by
jsboige
1 of 4 tasks
[ROCm][Bugfix] Fall back when Quark MoE AITER dispatch is unsupported
bug
Something isn't working
rocm
Related to AMD ROCm
#40300
opened Apr 19, 2026 by
Bortlesboat
Contributor
Draft
Register parsed config classes before tokenizer init
needs-rebase
#40299
opened Apr 19, 2026 by
Bortlesboat
Contributor
Draft
Implement incremental streaming for minimax_m2_tool_parser
tool-calling
#40298
opened Apr 19, 2026 by
frankie-ys
Contributor
1 of 4 tasks
[Platform][XPU] Opt-in integrated-GPU override for unified memory
intel-gpu
Related to Intel GPU
#40295
opened Apr 19, 2026 by
MegaStood
[Bugfix] Fix GPTQ not picking up desc_asc from model config
bug
Something isn't working
#40294
opened Apr 19, 2026 by
bajceta
4 tasks done
[ROCm] [Release] Clean-up Release Pipeline
ci/build
rocm
Related to AMD ROCm
#40293
opened Apr 19, 2026 by
tjtanaa
Collaborator
4 tasks
[ROCm][ViT] Detect Triton-AMD kernels at their new aiter location
rocm
Related to AMD ROCm
#40289
opened Apr 19, 2026 by
Lafunamor
[Bugfix] Fix dataset name and path argument validation bug in vllm bench serve
bug
Something isn't working
performance
Performance-related issues
verified
Run pre-commit for new contributors without triggering other tests
#40288
opened Apr 19, 2026 by
talorabr
Contributor
4 tasks
[XPU] Update nixl to v0.10.1 in Dockerfile
ci/build
intel-gpu
Related to Intel GPU
kv-connector
#40287
opened Apr 19, 2026 by
zhenwei-intel
Contributor
Draft
4 tasks
Fix DP internal LB first-request routing drift
v1
#40285
opened Apr 19, 2026 by
Abhicodeitout
3 of 4 tasks
Add Granite 4.1 Vision as built-in multimodal model
documentation
Improvements or additions to documentation
multi-modality
Related to multi-modality (#4194)
new-model
Requests to new models
#40282
opened Apr 19, 2026 by
artem-spector
4 tasks done
[Model] gemma4 core: quantized inference + gguf loading + fused moe gelu_tanh
cpu
Related to CPU backends
nvidia
#40281
opened Apr 19, 2026 by
lesj0610