Pull requests: vllm-project/vllm

Pull requests list

[Docs] [QeRL] Layerwise Reloading Documentation (labels: documentation)
#40317 opened Apr 20, 2026 by kylesayrs Contributor
[Docs] Fix thinking_token_budget docs (labels: documentation)
#40316 opened Apr 20, 2026 by milesial Contributor
fix: Do not make function calls when request has no tools for /v1/responses (labels: frontend, ready)
#40314 opened Apr 20, 2026 by terrytangyuan Contributor
[Model] Use AutoWeightsLoader for GPT2
#40312 opened Apr 20, 2026 by cben484
[Bugfix] Align chat completion tool_choice/tools validation with OpenAI spec (labels: bug, frontend, tool-calling)
#40311 opened Apr 20, 2026 by sfeng33 Collaborator
[Bugfix] Fix W4A8_FP8 MoE tp>1 correctness and view() TypeError (labels: bug)
#40310 opened Apr 20, 2026 by EdalatiAli Contributor Draft
4 tasks
[QeRL] Add warnings for extra memory buffering
#40309 opened Apr 20, 2026 by kylesayrs Contributor
[Bugfix] Fix hybrid KV manager for quantized per-token-head KV cache (labels: bug, v1)
#40308 opened Apr 20, 2026 by lesj0610
[ROCm][CI] Fix trust_remote_code AttributeError in EAGLE3 acceptance length test (labels: rocm, speculative-decoding, v1)
#40306 opened Apr 19, 2026 by AndreasKaratzas Collaborator
1 task done
feat: mla prefill trtllm static fp8 output
#40304 opened Apr 19, 2026 by carlyou Contributor Draft
4 tasks
[Bug] Fix shm_broadcast PyCFunction descriptor corruption under JIT loads (labels: bug)
#40303 opened Apr 19, 2026 by jsboige
1 of 4 tasks
[ROCm][Bugfix] Fall back when Quark MoE AITER dispatch is unsupported (labels: bug, rocm)
#40300 opened Apr 19, 2026 by Bortlesboat Contributor Draft
Implement incremental streaming for minimax_m2_tool_parser (labels: tool-calling)
#40298 opened Apr 19, 2026 by frankie-ys Contributor
1 of 4 tasks
[Platform][XPU] Opt-in integrated-GPU override for unified memory (labels: intel-gpu)
#40295 opened Apr 19, 2026 by MegaStood
[Bugfix] Fix GPTQ not picking up desc_asc from model config (labels: bug)
#40294 opened Apr 19, 2026 by bajceta
4 tasks done
[ROCm] [Release] Clean-up Release Pipeline (labels: ci/build, rocm)
#40293 opened Apr 19, 2026 by tjtanaa Collaborator
4 tasks
[ROCm][ViT] Detect Triton-AMD kernels at their new aiter location (labels: rocm)
#40289 opened Apr 19, 2026 by Lafunamor
[Bugfix] Fix dataset name and path argument validation bug in vllm bench serve (labels: bug, performance, verified)
#40288 opened Apr 19, 2026 by talorabr Contributor
4 tasks
[XPU] Update nixl to v0.10.1 in Dockerfile (labels: ci/build, intel-gpu, kv-connector)
#40287 opened Apr 19, 2026 by zhenwei-intel Contributor Draft
4 tasks
Fix DP internal LB first-request routing drift (labels: v1)
#40285 opened Apr 19, 2026 by Abhicodeitout
3 of 4 tasks
Add Granite 4.1 Vision as built-in multimodal model (labels: documentation, multi-modality, new-model)
#40282 opened Apr 19, 2026 by artem-spector
4 tasks done
[Feat]Make EPLB max expert redundancy configurable
#40280 opened Apr 19, 2026 by nainiu258