Readme, add infinity deployment documentation (#21)
- Update README.md (31e97fe8a364fd9c580cbe8e5432c263213afc60)
- Update README.md (13661f9e67466c5451f26990e0df3efa9090ae3e)
Co-authored-by: Michael <[email protected]>
README.md
CHANGED
@@ -5620,6 +5620,16 @@ scores = (embeddings[:2] @ embeddings[2:].T) * 100
 print(scores.tolist())
 ```
 
+### infinity_emb
+
+Usage via [infinity, MIT Licensed](https://github.com/michaelfeil/infinity).
+```bash
+docker run \
+--gpus "0" -p "7997":"7997" \
+michaelf34/infinity:0.0.68-trt-onnx \
+v2 --model-id Alibaba-NLP/gte-Qwen2-1.5B-instruct --revision "refs/pr/20" --dtype bfloat16 --batch-size 16 --device cuda --engine torch --port 7997 --no-bettertransformer
+```
+
 ## Evaluation
 
 ### MTEB & C-MTEB
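
The added README section starts an infinity server on port 7997 but does not show a client call. Below is a minimal sketch of one, using only the Python standard library. It rests on assumptions the diff does not state: that the container is reachable at `http://localhost:7997`, that it serves an OpenAI-style `POST /embeddings` route with no URL prefix, and that the response has the shape `{"data": [{"embedding": [...]}, ...]}`. The helper names `build_embedding_request` and `fetch_embeddings` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: the server from the docker command in the diff is reachable here
# and exposes an OpenAI-compatible embeddings route without a URL prefix.
BASE_URL = "http://localhost:7997"
MODEL_ID = "Alibaba-NLP/gte-Qwen2-1.5B-instruct"


def build_embedding_request(texts, model=MODEL_ID, base_url=BASE_URL):
    """Build (without sending) a POST request for the embeddings route."""
    payload = {"model": model, "input": list(texts)}
    return urllib.request.Request(
        url=f"{base_url}/embeddings",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embeddings(texts, timeout=30.0):
    """Send the request and return one embedding vector per input text."""
    request = build_embedding_request(texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # Assumption: OpenAI-style response body, one "embedding" list per input.
    return [item["embedding"] for item in body["data"]]
```

With the container running, `fetch_embeddings(["first text", "second text"])` should return two vectors if the assumptions above hold. Building the request separately from sending it keeps the payload inspectable without a live server.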