vulkan: In coopmat2 mmq, load q4_k/q5_k scales through shared memory (llama/12833) 4b7a407 jeffbolznv committed on Apr 9
vulkan: Use fp16 for the flash attention P*V multiplication (llama/12783) 4e46f41 jeffbolznv committed on Apr 9
llama : fix FA when KV cache is not used (i.e. embeddings) (llama/12825) e7cb2dc ggerganov committed on Apr 8
ggml: don't include arm_neon.h when using CUDA 12 with ARM Neon (ggml/1187) 87f1ea3 cmdr2 committed on Apr 10
ggml : add more generic custom op, remove deprecated custom ops (ggml/1183) ba7a5f8 Diego Devesa committed on Apr 9
Revert "sycl: remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor" (llama/12812) 3d4b079 Neo Zhang Jianyu committed on Apr 8
sycl: remove redundant memcopy in function ggml_backend_sycl_buffer_set_tensor (llama/12734) 7d3e668 jeffzhou2000 committed on Apr 7
vulkan: Use unclamped loads for flash attention mask (llama/12720) a76ef69 jeffbolznv committed on Apr 6
Vulkan: Tune Vulkan mmq int dot shader for performance (llama/12767) b3bf710 OccamRazor committed on Apr 5
sycl: allow ggml-sycl configuration and compilation using Visual Studio project/solution (llama/12625) 27cbcc9 Nicolò Scipione committed on Apr 4
cmake: fix ggml-shaders-gen compiler paths containing spaces (llama/12747) 1c89b7d Ronny Brendel committed on Apr 4
vulkan: Hybrid waitForFences/getFenceStatus to reduce fence latency (llama/12630) ee422be jeffbolznv committed on Apr 4
vulkan: set cmake minimum and project name in vulkan-shaders (llama/12744) 2459781 jeffbolznv committed on Apr 4
CUDA: Prefer vector flash decoding kernel for Gemma models (llama/12738) 5d7a13f Gaurav Garg JohannesGaessler committed on Apr 3
vulkan: Fix missing cmake logic for dot product extension (llama/12721) 7a1e8f8 jeffbolznv committed on Apr 3
CANN: Support operator SIN COS ARGMAX (llama/12709) 904aaf5 Chenguang Li noemotiovon committed on Apr 3
Simplify and improve CUDA graphs through use of indirect copy pointers (llama/9017) a2fdbe6 Alan Gray slaren committed on Apr 3
opencl: use `max_alloc_size` in backend ctx instead of querying again (llama/12705) 3847456 lhez committed on Apr 3
vulkan: Implement split_k for coopmat2 flash attention. (llama/12627) 5ab06d6 jeffbolznv committed on Apr 2
vulkan: Implement grouped query attention in the coopmat2 FA shader (llama/12559) e7bebe6 jeffbolznv committed on Apr 2
llama : add option to override model tensor buffers (llama/11397) 3d000b6 Diego Devesa committed on Apr 2
examples : add HEAPU8 to exported runtime methods (#3062) 2339555 danbev committed on Apr 20
ruby : make Ruby bindings installed with build options (#3056) 8d0a50d KitaitiMakoto committed on Apr 17
whisper : add no_context parameter to whisper_params (#3045) 0e991f8 sachaarbonel committed on Apr 16
examples : add FFmpeg v7.0 support to ffmpeg-transcode.cpp (#3038) 880d905 fujimotos committed on Apr 15
docs : update README.md to note newer nvidia gpus (#3031) 9401dde Jeff Klassen committed on Apr 11
addon.node : support max_context api for addon.node (#3025) 6c51a9b Lin Xiaodong linxiaodong committed on Apr 11
whisper : reduce delta_min from 1000ms to 100ms (#3028) d3e767a ggerganov committed on Apr 11
docs : document how to use 'WHISPER_FFMPEG' build option (#3029) aa64fa0 fujimotos committed on Apr 10