# whisper.cpp for SYCL

[Background](#background)

[OS](#os)

[Intel GPU](#intel-gpu)

[Linux](#linux)

[Environment Variable](#environment-variable)

[Known Issue](#known-issue)

[Todo](#todo)

## Background

SYCL is a higher-level programming model that improves programming productivity on various hardware accelerators, such as CPUs, GPUs, and FPGAs. It is a single-source embedded domain-specific language based on pure C++17.

oneAPI is a specification that is open and standards-based, supporting multiple architecture types including but not limited to GPU, CPU, and FPGA. The spec has both direct programming and API-based programming paradigms.

Intel uses SYCL as its direct programming language to support CPUs, GPUs, and FPGAs.

To avoid re-inventing the wheel, this code refers to other code paths in llama.cpp (like OpenBLAS, cuBLAS, CLBlast). We use the open-source tool [SYCLomatic](https://github.com/oneapi-src/SYCLomatic) (commercial release: [Intel® DPC++ Compatibility Tool](https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compatibility-tool.html)) to migrate the code to SYCL.

whisper.cpp for SYCL supports Intel GPUs.

For Intel CPUs, we recommend using whisper.cpp for x86 (Intel MKL build).

## OS

|OS|Status|Verified|
|-|-|-|
|Linux|Support|Ubuntu 22.04|
|Windows|Ongoing| |


## Intel GPU

|Intel GPU| Status | Verified Model|
|-|-|-|
|Intel Data Center Max Series| Support| Max 1550|
|Intel Data Center Flex Series| Support| Flex 170|
|Intel Arc Series| Support| Arc A770|
|Intel built-in Arc GPU| Support| built-in Arc GPU in Meteor Lake|
|Intel iGPU| Support| iGPU in i5-1250P, i7-1165G7|


## Linux

### Setup Environment

1. Install the Intel GPU driver.

a. Install the Intel GPU driver by following the official guide: [Install GPU Drivers](https://dgpu-docs.intel.com/driver/installation.html).

Note: for an iGPU, install the client GPU driver.

b. Add the user to the `render` and `video` groups.

```
sudo usermod -aG render username
sudo usermod -aG video username
```

Note: log out and log back in for the change to take effect.

c. Check that the GPU is detected:

```
sudo apt install clinfo
sudo clinfo -l
```

Output (example):

```
Platform #0: Intel(R) OpenCL Graphics
 `-- Device #0: Intel(R) Arc(TM) A770 Graphics

Platform #0: Intel(R) OpenCL HD Graphics
 `-- Device #0: Intel(R) Iris(R) Xe Graphics [0x9a49]
```
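
The group step above can be verified before moving on. The following is a minimal sketch, assuming a POSIX shell and the standard `id` utility; the helper name `missing_gpu_groups` is made up for this illustration and is not part of whisper.cpp.

```shell
# Report which of the required groups are absent from a group list.
# $1 = space-separated group names (for example, the output of `id -nG`).
missing_gpu_groups() {
    missing=""
    for required in render video; do
        case " $1 " in
            *" $required "*) ;;                  # already a member
            *) missing="$missing $required" ;;   # not a member yet
        esac
    done
    # Drop the leading space; prints an empty line if nothing is missing.
    echo "${missing# }"
}

# Check the current user.
result=$(missing_gpu_groups "$(id -nG)")
if [ -z "$result" ]; then
    echo "OK: user is in render and video"
else
    echo "Missing groups: $result (run usermod -aG, then log in again)"
fi
```

If a group is reported as missing right after running `usermod`, the session most likely predates the change and a new login is still needed.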

2. Install the Intel® oneAPI Base Toolkit.


a. Follow the procedure in [Get the Intel® oneAPI Base Toolkit](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit.html).

We recommend installing to the default folder: **/opt/intel/oneapi**.

The rest of this guide uses the default folder as an example. If you install to another folder, adjust the paths in this guide accordingly.

b. Check that the toolkit detects the GPU:

```
source /opt/intel/oneapi/setvars.sh
sycl-ls
```

The output should list one or more Level Zero devices, such as **[ext_oneapi_level_zero:gpu:0]**.



Output (example):

```
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device OpenCL 1.2  [2023.16.10.0.17_160000]
[opencl:cpu:1] Intel(R) OpenCL, 13th Gen Intel(R) Core(TM) i7-13700K OpenCL 3.0 (Build 0) [2023.16.10.0.17_160000]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) Arc(TM) A770 Graphics OpenCL 3.0 NEO  [23.30.26918.50]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Arc(TM) A770 Graphics 1.3 [1.3.26918]
```
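
The `sycl-ls` check can also be scripted. This is a rough sketch, assuming the Level Zero entries contain the substring `level_zero:gpu` as in the example output above; `has_level_zero_gpu` is a hypothetical helper name.

```shell
# Succeed if the text on stdin lists at least one Level Zero GPU.
# The substring also matches the ext_oneapi_level_zero:gpu prefix shown above.
has_level_zero_gpu() {
    grep -q 'level_zero:gpu'
}

if command -v sycl-ls >/dev/null 2>&1; then
    if sycl-ls | has_level_zero_gpu; then
        echo "Level Zero GPU found"
    else
        echo "No Level Zero GPU listed; check the GPU driver installation"
    fi
else
    echo "sycl-ls not found; run: source /opt/intel/oneapi/setvars.sh"
fi
```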



3. Build locally:



```
mkdir -p build
cd build
source /opt/intel/oneapi/setvars.sh

# For FP16:
#cmake .. -DWHISPER_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx -DWHISPER_SYCL_F16=ON

# For FP32:
cmake .. -DWHISPER_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx

# Build examples/main only:
#cmake --build . --config Release --target main

# Build all binaries:
cmake --build . --config Release -v
```



Or use the build script:

```
./examples/sycl/build.sh
```



Note:

- By default, all binaries are built, which takes more time. To reduce build time, we recommend building **examples/main** only.
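
The FP32 and FP16 configurations differ only by one option, so the choice can be wrapped in a small helper. The sketch below only prints the commands rather than running them; `sycl_cmake_flags` is a hypothetical name, and the options are the ones listed in the build steps above.

```shell
# Print the CMake options for the SYCL build.
# $1 = "fp16" adds -DWHISPER_SYCL_F16=ON; any other value gives the FP32 build.
sycl_cmake_flags() {
    flags="-DWHISPER_SYCL=ON -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"
    if [ "$1" = "fp16" ]; then
        flags="$flags -DWHISPER_SYCL_F16=ON"
    fi
    echo "$flags"
}

# Show the commands for an FP16 build of the main target (printed, not executed).
echo "cmake .. $(sycl_cmake_flags fp16)"
echo "cmake --build . --config Release --target main"
```

To actually build, run the printed commands from the `build` directory after sourcing `setvars.sh`.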

### Run

1. Put the model file in the **models** folder.

2. Enable the oneAPI runtime environment:

```
source /opt/intel/oneapi/setvars.sh
```

3. List the device IDs

Run without parameters:

```
./build/bin/ls-sycl-device

# or

./build/bin/main
```

Check the device ID in the startup log, for example:

```
found 4 SYCL devices:
  Device 0: Intel(R) Arc(TM) A770 Graphics,	compute capability 1.3,
    max compute_units 512,	max work group size 1024,	max sub group size 32,	global mem size 16225243136
  Device 1: Intel(R) FPGA Emulation Device,	compute capability 1.2,
    max compute_units 24,	max work group size 67108864,	max sub group size 64,	global mem size 67065057280
  Device 2: 13th Gen Intel(R) Core(TM) i7-13700K,	compute capability 3.0,
    max compute_units 24,	max work group size 8192,	max sub group size 64,	global mem size 67065057280
  Device 3: Intel(R) Arc(TM) A770 Graphics,	compute capability 3.0,
    max compute_units 512,	max work group size 1024,	max sub group size 32,	global mem size 16225243136
```

|Attribute|Note|
|-|-|
|compute capability 1.3|Level Zero runtime, recommended|
|compute capability 3.0|OpenCL runtime, slower than Level Zero in most cases|
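
Picking the recommended device from the listing can be automated. This is a minimal sketch that assumes the listing format shown above and treats "compute capability 1.3" as the Level Zero entry, per the table; `first_level_zero_id` is a hypothetical helper, and the sample listing is shortened.

```shell
# Print the ID of the first device that reports compute capability 1.3.
# Reads the device listing from stdin.
first_level_zero_id() {
    awk '/^ *Device [0-9]+:/ && /compute capability 1\.3/ {
        id = $2             # second field looks like "0:"
        sub(/:$/, "", id)   # drop the trailing colon
        print id
        exit                # stop at the first match
    }'
}

# Sample listing in the format shown above (shortened).
sample_log='found 2 SYCL devices:
  Device 0: 13th Gen Intel(R) Core(TM) i7-13700K, compute capability 3.0,
  Device 1: Intel(R) Arc(TM) A770 Graphics, compute capability 1.3,'

device_id=$(printf '%s\n' "$sample_log" | first_level_zero_id)
echo "GGML_SYCL_DEVICE=$device_id"
```

With a real build, the sample can be replaced by the output of `./build/bin/ls-sycl-device`. If the listing is written to stderr on your system, add `2>&1` before the pipe.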

4. Set the device ID and run whisper.cpp

Select device 0 with **GGML_SYCL_DEVICE=0**:

```
GGML_SYCL_DEVICE=0 ./build/bin/main -m models/ggml-base.en.bin -f samples/jfk.wav
```

Or run the script:

```
./examples/sycl/run_whisper.sh
```



5. Check the device ID in the output, for example:

```
Using device 0 (Intel(R) Arc(TM) A770 Graphics) as main device
```


## Environment Variable

#### Build

|Name|Value|Function|
|-|-|-|
|WHISPER_SYCL|ON (mandatory)|Enable the build with the SYCL code path. <br>Mandatory for both FP32 and FP16 builds.|
|WHISPER_SYCL_F16|ON (optional)|Enable the FP16 build with the SYCL code path. Do not set it for FP32.|
|CMAKE_C_COMPILER|icx|Use the icx compiler for the SYCL code path.|
|CMAKE_CXX_COMPILER|icpx|Use the icpx compiler for the SYCL code path.|

#### Running


|Name|Value|Function|
|-|-|-|
|GGML_SYCL_DEVICE|0 (default) or 1|Set the ID of the device to use. Check the device IDs in the output of a run without parameters.|
|GGML_SYCL_DEBUG|0 (default) or 1|Enable the log function controlled by the macro GGML_SYCL_DEBUG.|

## Known Issue

- Error:  `error while loading shared libraries: libsycl.so.7: cannot open shared object file: No such file or directory`.

  Cause: the oneAPI runtime environment is not enabled.

  Solution: install the oneAPI Base Toolkit and enable it with `source /opt/intel/oneapi/setvars.sh`.


- Hang during startup

  llama.cpp uses mmap by default to read the model file and copy it to the GPU. On some systems, memcpy behaves abnormally and blocks.

  Solution: add **--no-mmap**.
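
For the `libsycl.so.7` loading error listed first, a quick check can tell whether the current shell has the oneAPI environment enabled. The sketch below assumes that `setvars.sh` exposes the library through `LD_LIBRARY_PATH`. It does not cover libraries registered system-wide through the dynamic linker cache. `libsycl_on_path` is a hypothetical helper name.

```shell
# Return 0 if a libsycl shared library exists in one of the directories
# of a colon-separated search path ($1, for example "$LD_LIBRARY_PATH").
libsycl_on_path() {
    old_ifs=$IFS
    IFS=:
    for dir in $1; do
        for lib in "$dir"/libsycl.so*; do
            if [ -e "$lib" ]; then
                IFS=$old_ifs
                return 0
            fi
        done
    done
    IFS=$old_ifs
    return 1
}

if libsycl_on_path "${LD_LIBRARY_PATH:-}"; then
    echo "libsycl found on LD_LIBRARY_PATH"
else
    echo "libsycl not found; run: source /opt/intel/oneapi/setvars.sh"
fi
```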

## Todo

- Support building on Windows.

- Support multiple cards.