Update README.md
README.md
CHANGED
@@ -223,7 +223,7 @@ evalplus.evaluate \
 ## Inference Performance
 
 
-This model achieves up to
+This model achieves up to 2.7x speedup in single-stream deployment and up to 1.5x speedup in multi-stream asynchronous deployment, depending on hardware and use-case scenario.
 The following performance benchmarks were conducted with [vLLM](https://docs.vllm.ai/en/latest/) version 0.6.6.post1, and [GuideLLM](https://github.com/neuralmagic/guidellm).
 
 ### Single-stream performance (measured with vLLM version 0.6.6.post1)