ggerganov committed on
Commit
f0cc63c
·
1 Parent(s): af4838c

Create README.md

Files changed (1)
  1. models/README.md +28 -0
models/README.md ADDED
@@ -0,0 +1,28 @@
+ ## Whisper model files in custom ggml format
+
+ The [original Whisper PyTorch models provided by OpenAI](https://github.com/openai/whisper/blob/main/whisper/__init__.py#L17-L27)
+ have been converted to a custom `ggml` format so that they can be loaded in C/C++. The conversion was performed with the
+ [convert-pt-to-ggml.py](convert-pt-to-ggml.py) script. You can either obtain the original models and generate the `ggml` files
+ yourself using the conversion script, or use the [download-ggml-model.sh](download-ggml-model.sh) script to download the
+ already converted models.
+
+ Sample usage:
+
+ ```bash
+ $ ./download-ggml-model.sh base.en
+ Downloading ggml model base.en ...
+ models/ggml-base.en.bin 100%[=============================================>] 141.11M 5.41MB/s in 22s
+ Done! Model 'base.en' saved in 'models/ggml-base.en.bin'
+ You can now use it like this:
+
+ $ ./main -m models/ggml-base.en.bin -f samples/jfk.wav
+ ```
+
+ A third option to obtain the model files is to download them from Hugging Face:
+
+ https://huggingface.co/datasets/ggerganov/whisper.cpp/tree/main
+
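For scripted downloads, the Hugging Face listing above can be turned into a direct file URL. The sketch below is only an illustration: it assumes the Hub's usual `resolve/main/<file>` path layout and the `ggml-<model>.bin` naming seen in the sample output, neither of which is confirmed by this README, and the helper name `hf_model_url` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): build a direct-download
# URL for a converted model hosted in the Hugging Face dataset.
# Assumptions: files are served under resolve/main/ and are named
# ggml-<model>.bin, matching the sample output of download-ggml-model.sh.
hf_model_url() {
    model="$1"
    base="https://huggingface.co/datasets/ggerganov/whisper.cpp"
    printf '%s/resolve/main/ggml-%s.bin\n' "$base" "$model"
}

# Print the URL for the base.en model.
hf_model_url base.en
```

The printed URL could then be passed to a downloader, for example `curl -L -o models/ggml-base.en.bin "$(hf_model_url base.en)"`. Whether a given model name exists in the dataset should be checked against the listing first.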
+ ## Model files for testing purposes
+
+ The model files prefixed with `for-tests-` are empty (i.e. they do not contain any weights) and are used by the CI for testing purposes.
+ They are included directly in this repository for convenience, and the GitHub Actions CI uses them to run various sanitizer tests.
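
Because the placeholder files share a common prefix, they are easy to tell apart from real models. As a rough sketch, assuming they sit in the `models` directory next to this README, the hypothetical helper `list_test_models` below prints every file whose name starts with `for-tests-`:

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): list the weightless
# placeholder models in a directory, identified only by their name prefix.
list_test_models() {
    dir="${1:-models}"
    for f in "$dir"/for-tests-*; do
        # When nothing matches, the glob stays unexpanded; skip it.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done
}

# Defaults to the models directory; prints nothing if no placeholders exist.
list_test_models models
```

A script that runs real transcriptions could use such a list to skip these files, since they contain no weights.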