Pull requests: ggml-org/llama.cpp
Pull requests list
spacemit : fix wrong transpose function for int16 data
ggml
changes relating to the ggml tensor library for machine learning
#25161
opened Jun 30, 2026 by
I3eg1nner
opencl: initial q1_0 support
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
convert: add --temp-dir option for custom temp file location
conversion
#25155
opened Jun 30, 2026 by
ayaan-7505
ggml: imatrix-aware NVFP4 quantization (scale search) + wire NVFP4 ftype
examples
ggml
changes relating to the ggml tensor library for machine learning
#25153
opened Jun 30, 2026 by
avifenesh
common, server : preserve HF file for cached models
server
#25152
opened Jun 29, 2026 by
mrexodia
CUDA: add COL2IM_1D op
CUDA
Related to the CUDA backend
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
#25151
opened Jun 29, 2026 by
Ssamdeman
CUDA: fix Gemma E4B MTP FlashAttention
CUDA
Related to the CUDA backend
ggml
changes relating to the ggml tensor library for machine learning
#25148
opened Jun 29, 2026 by
JohannesGaessler
Contributor
speculative: fix MTP draft crash on vision inputs
#25144
opened Jun 29, 2026 by
ServeurpersoCom
Contributor
Draft
ggml-webgpu: add support for NVFP4
ggml
changes relating to the ggml tensor library for machine learning
WebGPU
#25143
opened Jun 29, 2026 by
yomaytk
Contributor
model : register t_layer_inp for qwen3next
model
Model specific
#25141
opened Jun 29, 2026 by
jschmied
common,server: handle bracketed IPv6 literals in URL authority
server
#25140
opened Jun 29, 2026 by
ServeurpersoCom
Contributor
ui: strip path and weight extension from model id in single model mode
server/ui
#25137
opened Jun 29, 2026 by
ServeurpersoCom
Contributor
llama : add position-relocatable KV range save/load
testing
Everything test related
#25133
opened Jun 29, 2026 by
Anyesh
ui: add sync blocks so display/behavior settings can be set via --ui-config-file
server/ui
#25132
opened Jun 29, 2026 by
ServeurpersoCom
Contributor
[SYCL] enhance argsort to support all UT cases
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25125
opened Jun 29, 2026 by
arthw
Contributor
[SYCL] fix unsupport ACC UT cases for noncontiguous
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25124
opened Jun 29, 2026 by
arthw
Contributor
ggml-cpu: fix NEON build compilation on 32-bit ARMv7 architectures without hardware FP16
ggml
changes relating to the ggml tensor library for machine learning
#25119
opened Jun 29, 2026 by
Smu1zel
opencl: add ABS op
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
OpenCL
Issues specific to the OpenCL backend
#25115
opened Jun 29, 2026 by
Gezahegne
Add Vision Support for Minimax-M3
conversion
model
Model specific
mtmd
Related to multimodal functionality (video/image/audio)
testing
Everything test related
#25113
opened Jun 28, 2026 by
timkhronos
Contributor
Draft
Granite-Switch Architecture
conversion
model
Model specific
#25107
opened Jun 28, 2026 by
barvhaim
CUDA: fix get_rows_back for tables with more than 65535 rows (grid-y clamp + stride)
CUDA
Related to the CUDA backend
ggml
changes relating to the ggml tensor library for machine learning
merge ready
A maintainer can use this label to indicate that they consider the changes final and ready to merge.
testing
Everything test related
#25103
opened Jun 28, 2026 by
mattjallo
server : auto-insert media marker in embedding / multimodal prompts
server
#25093
opened Jun 28, 2026 by
TheOneWhoWill
sycl: fix check_graph_compatibility() to allow graphs for MoE decode (CONCAT dim!=3, MUL_MAT_ID fused path)
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#25089
opened Jun 28, 2026 by
Captain-Tripps
5 tasks done