Pull requests: ggml-org/llama.cpp
Pull requests list
feat: Add file descriptor based model loading for Android SAF support
ggml
changes relating to the ggml tensor library for machine learning
#18870
opened Jan 15, 2026 by
Siddhesh2377
convert_hf_to_gguf.py: refactor modify_tensors to call super
python
python script changes
#18866
opened Jan 15, 2026 by
am17an
sampling : update outdated comment about has_sampled [no ci]
#18863
opened Jan 15, 2026 by
danbev
ggml-cpu: aarm64: q5_K repack gemm and gemv (and generic) implementations (i8mm)
ggml
changes relating to the ggml tensor library for machine learning
#18860
opened Jan 15, 2026 by
Alcpz
ggml-cpu: add RVV vec dot kernels for quantization types
ggml
changes relating to the ggml tensor library for machine learning
#18859
opened Jan 15, 2026 by
rehan-10xengineer
ggml-cpu: add q4_0 repack support for wasm
ggml
changes relating to the ggml tensor library for machine learning
enforce response_format and json_schema for Kimi K2
testing
Everything test related
#18851
opened Jan 15, 2026 by
akoumjian
Deepseek v3.2 dense attention support from @fairydreaming
python
python script changes
#18849
opened Jan 14, 2026 by
createthis
[RFC] Integrate sparse-ternary-fma for TQ2_0 quantization
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#18836
opened Jan 14, 2026 by
HyperFoldUK
vulkan: Revert forced full subgroup for FlashAttention
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#18831
opened Jan 14, 2026 by
rillomas
ggml-backend: Separate dynamic lib install and search paths, add relative search
ggml
changes relating to the ggml tensor library for machine learning
#18817
opened Jan 13, 2026 by
DaAwesomeP
HIP: tune mmq/rocblas switching for RDNA4
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#18816
opened Jan 13, 2026 by
jiachengjason
sampling : remove sampling branching in output_reserve
#18811
opened Jan 13, 2026 by
danbev
llama: fix integer type consistency in split helpers
#18798
opened Jan 13, 2026 by
MaheshJakkala
Unified delta net handling for Qwen3Next and Kimi Linear models
model
Model specific
#18792
opened Jan 12, 2026 by
pwilkin
server: fix memory reservations in populate_token_probs
examples
server
#18787
opened Jan 12, 2026 by
l-austenfeld
ggml-cpu: add RVV vec dot kernels for quantization types
ggml
changes relating to the ggml tensor library for machine learning
#18784
opened Jan 12, 2026 by
taimur-10x
Draft
webui : send both backend_sampling == false/true
examples
server
#18781
opened Jan 12, 2026 by
ggerganov