Vulcan committed
Commit 71b840c · unverified · 1 parent: 6a785f1

readme : partial OpenCL GPU support via CLBlast (#863)


* ggml : CLBlast support as in llama.cpp

Building with CLBlast speeds up whisper.cpp ~2x on low-end / older AMD APUs (a CPU with an integrated GPU) such as the A9.

Usage:
WHISPER_CLBLAST=1 make

* CMake/Makefile : CLBlast support as in llama.cpp

Usage:
Makefile:

```
cd whisper.cpp
WHISPER_CLBLAST=1 make
```

CMake:

```
cd whisper.cpp ; mkdir build ; cd build
cmake -DWHISPER_CLBLAST=ON ..
make
```

* Update README.md

Added OpenCL Build Instructions

* Instruction: Partial OpenCL GPU support via CLBlast

Added build instructions and examples for Make and CMake to support OpenCL-enabled GPUs.
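Taken together, the Make and CMake paths above differ only in how the CLBlast switch is passed. A minimal sketch of that choice — the `choose_build_cmd` helper and its `clblast`/`cpu` arguments are illustrative names invented here; only the `WHISPER_CLBLAST=1` flag and the `make` invocation come from this commit:

```shell
#!/bin/sh
# Illustrative sketch: print the whisper.cpp Makefile build invocation for a
# given backend. choose_build_cmd is a made-up helper; WHISPER_CLBLAST=1 is
# the flag the Makefile checks per this commit.
choose_build_cmd() {
    case "$1" in
        clblast) echo "WHISPER_CLBLAST=1 make -j" ;;
        *)       echo "make -j" ;;
    esac
}

choose_build_cmd clblast
choose_build_cmd cpu
```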

Files changed (1): README.md (+24 −0)
README.md CHANGED
@@ -20,6 +20,7 @@ High-performance inference of [OpenAI's Whisper](https://github.com/openai/whisp
  - Zero memory allocations at runtime
  - Runs on the CPU
  - [Partial GPU support for NVIDIA via cuBLAS](https://github.com/ggerganov/whisper.cpp#nvidia-gpu-support-via-cublas)
+ - [Partial OpenCL GPU support via CLBlast](https://github.com/ggerganov/whisper.cpp#opencl-gpu-support-via-clblast)
  - [C-style API](https://github.com/ggerganov/whisper.cpp/blob/master/whisper.h)
 
  Supported platforms:
@@ -311,6 +312,29 @@ make clean
  WHISPER_CUBLAS=1 make -j
  ```
 
+ ## OpenCL GPU support via CLBlast
+
+ For cards and integrated GPUs that support OpenCL, the Encoder processing can be largely offloaded to the GPU through CLBlast. This is especially useful for users with AMD APUs or low-end devices, yielding up to a ~2x speedup.
+
+ First, make sure you have installed `CLBlast` for your OS or distribution: https://github.com/CNugteren/CLBlast
+
+ Now build `whisper.cpp` with CLBlast support:
+
+ ```
+ Makefile:
+ cd whisper.cpp
+ make clean
+ WHISPER_CLBLAST=1 make -j
+
+ CMake:
+ cd whisper.cpp ; mkdir build ; cd build
+ cmake -DWHISPER_CLBLAST=ON ..
+ make clean
+ make -j
+ cp bin/* ../
+ ```
+
  Run all the examples as usual.
 
  ## Limitations
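Per the new README section, the examples run unchanged after a CLBlast build. A minimal sketch of assembling the usual example invocation — the `run_example_cmd` helper is a made-up name, and the model/sample paths are assumptions based on the stock whisper.cpp tree, not part of this commit:

```shell
#!/bin/sh
# Illustrative sketch: build the command line for the bundled `main` example.
# run_example_cmd is a made-up helper; the paths passed below are the stock
# whisper.cpp model and sample locations.
run_example_cmd() {
    model="$1"
    audio="$2"
    echo "./main -m $model -f $audio"
}

run_example_cmd models/ggml-base.en.bin samples/jfk.wav
```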