Update README.md
README.md
CHANGED
@@ -150,10 +150,14 @@ Below are DeciCoder's pass@1 on MultiPL HumanEval scores
 
 ### Runtime Benchmarks
 
-|Inference Tool/Hardware |
-|
-| HF Inference Endpoints | 1,364.2 | 3,244.4
-| Infery LLM | 3,889.3 | 11,676.8 |
+|Inference Tool/Hardware | A10 (tokens/sec) | A10 Latency (ms)| A100 (tokens/sec) | A100 Latency (ms) |
+|:----------|:----------|:----------|:----------|:----------|
+| HF Inference Endpoints | 1,364.2 | 9.03 | 3,244.4 | 8.8 |
+| Infery LLM | 3,889.3 | 3.075 | 11,676.8 | 1.729 |
+
+>**NOTE:**
+>- Latency - Total generation time of batch size 1 (prefill+generate)
+>- Throughput (tokens/sec) - Measured with optimal batchsize per hardware - A10 on BS 128, A100 on BS 512
 
 ## Documentation
 
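
For reviewers who want to sanity-check the new columns, the relative gains implied by the added rows can be computed directly from the figures in this diff. The following is a minimal sketch, not part of the README change itself: the `BENCHMARKS` dictionary and the `speedups` helper are hypothetical names introduced only for illustration, and the numbers are copied verbatim from the added table rows.

```python
# Figures copied from the benchmark rows added in this diff.
# Tuple order: (A10 tokens/sec, A10 latency, A100 tokens/sec, A100 latency)
# BENCHMARKS and speedups() are illustrative names, not part of the repository.
BENCHMARKS = {
    "HF Inference Endpoints": (1364.2, 9.03, 3244.4, 8.8),
    "Infery LLM": (3889.3, 3.075, 11676.8, 1.729),
}


def speedups(baseline: str, candidate: str) -> dict:
    """Return per-column ratios; a value above 1 means the candidate is better.

    Throughput ratios are candidate / baseline (higher is better),
    latency ratios are baseline / candidate (lower is better).
    """
    b_a10_tps, b_a10_lat, b_a100_tps, b_a100_lat = BENCHMARKS[baseline]
    c_a10_tps, c_a10_lat, c_a100_tps, c_a100_lat = BENCHMARKS[candidate]
    return {
        "A10 throughput": c_a10_tps / b_a10_tps,
        "A10 latency": b_a10_lat / c_a10_lat,
        "A100 throughput": c_a100_tps / b_a100_tps,
        "A100 latency": b_a100_lat / c_a100_lat,
    }


if __name__ == "__main__":
    ratios = speedups("HF Inference Endpoints", "Infery LLM")
    for column, ratio in ratios.items():
        print(f"{column}: {ratio:.2f}x")
```

Note that, per the NOTE block in the diff, throughput and latency were measured under different batch sizes, so the throughput and latency ratios are not directly comparable to each other.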