# Update README.md
Note that the latency result (benchmark_latency) is reported in seconds, and the serving result (benchmark_serving) in requests per second.

## benchmark_latency

You need to install the vllm nightly build to pick up some recent changes:

```
pip install vllm --pre --extra-index-url https://wheels.vllm.ai/nightly
```
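If you want to double-check which build ended up in your environment, printing the package version is a quick, optional sanity check (nightly wheels normally carry a `.dev` suffix):

```
# optional: confirm the installed vllm build; nightly wheels usually report a .dev version
python -c "import vllm; print(vllm.__version__)"
```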
Get the vllm source code:
```
git clone [email protected]:vllm-project/vllm.git
```

Run the following under the `vllm` root folder:

### baseline
```
python benchmarks/benchmark_latency.py --input-len 256 --output-len 256 --model ...
```
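The `--model` value is left elided above; as one concrete illustration (the model `facebook/opt-125m` is an assumption made here because it is a small, freely downloadable checkpoint, not a choice taken from this README, and flag names can shift between vllm versions), a full invocation might look like:

```
# latency benchmark: 256-token prompts, 256-token generations, illustrative model choice
python benchmarks/benchmark_latency.py \
    --model facebook/opt-125m \
    --input-len 256 \
    --output-len 256
```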
We also benchmarked the throughput in a serving environment.

Download the sharegpt dataset: `wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json`

Other datasets can be found at: https://github.com/vllm-project/vllm/tree/main/benchmarks
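Optionally, you can sanity-check the download; the file is a single JSON array of conversations, so loading it and printing the entry count (assuming the default filename from the `wget` above) is enough to confirm it is intact:

```
# optional: count conversations in the downloaded ShareGPT split
python -c "import json; print(len(json.load(open('ShareGPT_V3_unfiltered_cleaned_split.json'))))"
```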
Get the vllm source code:
```
git clone [email protected]:vllm-project/vllm.git
```

Run the following under the `vllm` root folder:

### baseline
Server:
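As a rough sketch, a baseline serving run typically pairs an OpenAI-compatible vllm server with the `benchmark_serving.py` client. The model, prompt count, and exact flags below are illustrative assumptions, not taken from this README, and may differ across vllm versions:

```
# terminal 1: launch an OpenAI-compatible server (model choice is illustrative)
vllm serve facebook/opt-125m --disable-log-requests

# terminal 2: replay ShareGPT requests against it
python benchmarks/benchmark_serving.py \
    --backend vllm \
    --model facebook/opt-125m \
    --dataset-name sharegpt \
    --dataset-path ShareGPT_V3_unfiltered_cleaned_split.json \
    --num-prompts 200
```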