Commits · natasa365/whisper.cpp

Unified return format

e408c7b

3v324v23 committed on Sep 26, 2025

fix cmd

6c8a230

3v324v23 committed on Sep 26, 2025

fix cmd

a1a7aac

3v324v23 committed on Sep 26, 2025

fix build

1b9b118

3v324v23 committed on Sep 26, 2025

fix build

d937c3a

3v324v23 committed on Sep 26, 2025

fix dockerfile path

03ff3a5

3v324v23 committed on Sep 26, 2025

add meta

36ff0ea

3v324v23 committed on Sep 26, 2025

chore: track binaries with git-lfs

aa000f7

3v324v23 committed on Sep 26, 2025

chore: track binaries with git-lfs

f33d63d

3v324v23 committed on Sep 26, 2025

add sync task

46ebeba

3v324v23 committed on Sep 26, 2025

Handle negative value in padding (#3389)

6e115ac
unverified

Treboko committed on Aug 24, 2025

models : update `./models/download-ggml-model.cmd` to allow for tdrz download (#3381)

0b65831
unverified

Thea Mukhi

danbev committed on Aug 24, 2025

talk-llama : sync llama.cpp

4321600

ggerganov committed on Aug 18, 2025

sync : ggml

a0af6fc

ggerganov committed on Aug 18, 2025

ggml: Add initial WebGPU backend (llama/14521)

4b3da1d

Reese Levine committed on Aug 18, 2025

ggml : initial zDNN backend (llama/14975)

6dd510c

taronaeo committed on Aug 18, 2025

common : handle mxfp4 enum

fd4c0e1

ggerganov committed on Aug 18, 2025

ggml-quants : fix make_qp_quants NANs and IQ1 assertion errors (llama/15379)

a575f57

compilade committed on Aug 18, 2025

vulkan: disable spirv-opt for bfloat16 shaders (llama/15352)

cf24af7

jeffbolznv committed on Aug 18, 2025

vulkan: Use larger workgroups for mul_mat_vec when M is small (llama/15355)

054584a

jeffbolznv

OccamRazor committed on Aug 17, 2025

vulkan: support sqrt (llama/15370)

e5406c0

Dong Won Kim committed on Aug 17, 2025

vulkan: Optimize argsort (llama/15354)

80a188c

jeffbolznv committed on Aug 17, 2025

vulkan: fuse adds (llama/15252)

ad199b1

jeffbolznv committed on Aug 16, 2025

vulkan: Support mul_mat_id with f32 accumulators (llama/15337)

41a76e6

jeffbolznv committed on Aug 16, 2025

vulkan: Add missing bounds checking to scalar/coopmat1 mul_mat_id (llama/15334)

a6fa78e

jeffbolznv committed on Aug 16, 2025

OpenCL: add initial FA support (llama/14987)

8ece1ee

mrfatso committed on Aug 16, 2025

opencl: add initial mxfp4 support via mv (llama/15270)

1a0281c

lhez shawngu-quic committed on Aug 15, 2025

vulkan : fix out-of-bounds access in argmax kernel (llama/15342)

78a1865

ggerganov committed on Aug 15, 2025

vulkan : fix compile warnings on macos (llama/15340)

e3107ff

ggerganov committed on Aug 15, 2025

ggml: initial IBM zDNN backend (llama/14975)

449e1a4

taronaeo committed on Aug 15, 2025

CUDA: fix negative KV_max values in FA (llama/15321)

6e3a7b6

JohannesGaessler committed on Aug 14, 2025

HIP: Cleanup hipification header (llama/15285)

7cdf9cd

uvos

JohannesGaessler committed on Aug 14, 2025

vulkan: perf_logger improvements (llama/15246)

d48d508

jeffbolznv committed on Aug 14, 2025

ggml: fix ggml_conv_1d_dw bug (ggml/1323)

4496862

jasonni2 committed on Aug 14, 2025

cuda : fix GGML_CUDA_GRAPHS=OFF (llama/15300)

59c694d

Sigbjørn Skjæret committed on Aug 14, 2025

finetune: SGD optimizer, more CLI args (llama/13873)

f585fe7

Jonathan Graehl

OccamRazor

JohannesGaessler committed on Aug 14, 2025

HIP: bump requirement to rocm 6.1 (llama/15296)

58a3802

uvos committed on Aug 13, 2025

ggml : update `ggml_rope_multi` (llama/12665)

b4896dc

Judd

ggerganov committed on Aug 13, 2025

ggml : repack block_iq4_nlx8 (llama/14904)

db4407f

ggerganov committed on Aug 13, 2025

CUDA: Optimize `reduce_rows_f32` kernel, leading up to 25x perf improvement on kernel-level and 10% perf increase for Gemma3n (llama/15132)

c768824

ORippler committed on Aug 13, 2025

ggml-rpc: chunk send()/recv() to avoid EINVAL for very large tensors over RPC (macOS & others) (llama/15188)

c8284f2

aixsatoshi Shinnosuke Takagi committed on Aug 13, 2025

HIP: disable sync warp shuffel operators from clr amd_warp_sync_functions.h (llama/15273)

8fca6dd

uvos committed on Aug 12, 2025

sycl: Fix and disable more configurations of mul_mat (llama/15151)

7b868ed

Romain Biessy committed on Aug 12, 2025

opencl: allow mixed f16/f32 `add` (llama/15140)

345810b

mrfatso committed on Aug 12, 2025

CUDA cmake: add `-lineinfo` for easier debug (llama/15260)

008e169

am17an committed on Aug 12, 2025

CANN: GGML_OP_CPY optimization (llama/15070)

73e90ff

Chenguang Li committed on Aug 12, 2025

musa: fix failures in test-backend-ops for mul_mat_id op (llama/15236)

4168dda

yeahdongcn committed on Aug 12, 2025

CANN: Add broadcast for softmax and FA (llama/15208)

db87c9d

hipudding committed on Aug 11, 2025

kleidiai: fix unsigned overflow bug (llama/15150)

9d5f58c

Charles Xu committed on Aug 11, 2025

cuda: refactored ssm_scan and use CUB (llama/13291)

7a187d1

David Zhao committed on Aug 9, 2025

