Open
Labels: 🙋♀️ help & questions (Extra attention is needed)
Description
Pre-checks
- I searched existing issues and discussions
What problem are you trying to solve?
The Vulkan backend does not support the UPSCALE operator, which results in slow inference speeds for certain models (such as Qwen3VL).
warmup: *****************************************************************
warmup: WARNING: the CLIP graph uses unsupported operators by the backend
warmup: the performance will be suboptimal
warmup: list of unsupported ops (backend=Vulkan0):
warmup: UPSCALE: type = f32, ne = [92 92 1024 1]
warmup: flash attention is enabled
warmup: please report this on github as an issue
warmup: ref: https://github.com/ggml-org/llama.cpp/pull/16837#issuecomment-3461676118
warmup: *****************************************************************
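For readers unfamiliar with the operator, here is a minimal sketch of what an upscale step computes on a single 2D plane. The log above does not say which interpolation mode the CLIP graph requests, so nearest-neighbor is assumed here purely for illustration. The function name `upscale_nearest` is hypothetical and this is not the ggml or Vulkan implementation.

```python
# Illustrative only: nearest-neighbor upscaling of one 2D plane.
# Assumption: the interpolation mode is nearest-neighbor; the issue log
# does not state which mode the model's graph actually uses.

def upscale_nearest(src, out_h, out_w):
    """Resize a 2D list `src` to out_h x out_w by nearest-neighbor sampling."""
    in_h, in_w = len(src), len(src[0])
    out = []
    for y in range(out_h):
        # Map the output row back to a source row.
        sy = min(y * in_h // out_h, in_h - 1)
        row = []
        for x in range(out_w):
            # Map the output column back to a source column.
            sx = min(x * in_w // out_w, in_w - 1)
            row.append(src[sy][sx])
        out.append(row)
    return out


grid = [[1.0, 2.0],
        [3.0, 4.0]]
up = upscale_nearest(grid, 4, 4)
for r in up:
    print(r)
```

Every output element depends only on its own coordinates, so the work is independent per element. In the reported case the operator would apply to a tensor of shape `ne = [92 92 1024 1]`.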
What would you like NexaSDK to do?
Add support for the UPSCALE operator in the Vulkan backend.
Alternatives you've considered
Who does this help, and how much?
Additional context
Refer to this link: https://github.com/ggml-org/llama.cpp/pull/18327#event-21729873367