Major update to README.md for model card
README.md
CHANGED
````diff
@@ -11,31 +11,30 @@ tags:
 - loss:ContrastiveLoss
 base_model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
 widget:
-- source_sentence:
+- source_sentence: Russell Jurney
   sentences:
-  -
-  -
-  -
-- source_sentence:
+  - Russell H. Jurney
+  - Russ Jurney
+  - Русс Джерни
+- source_sentence: Ben Lorica
   sentences:
-  -
-  -
-  -
-- source_sentence:
+  - Benjamin Lorica
+  - 罗瑞卡
+  - 罗睿姬
+- source_sentence: Yevgeny Prigozhin
   sentences:
-  -
-  -
-
-- source_sentence: ויליאם בלייר
+  - Евге́ний Ви́кторович Приго́жин
+  - Y. Prighozhin
+- source_sentence: M.R. James
   sentences:
-  -
-  -
-  -
-- source_sentence:
+  - Montague Rhodes James
+  - J.R. James
+  - Mr. James
+- source_sentence: Muhammad Ali
   sentences:
-  -
-  -
-  -
+  - مُحَمَّد عَلِيّ
+  - Mohammed Ali
+  - Sonny Liston
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -48,7 +47,7 @@ metrics:
 - cosine_ap
 - cosine_mcc
 model-index:
-- name:
+- name: Graphlet-AI/eridu
   results:
   - task:
       type: binary-classification
@@ -83,9 +82,9 @@ model-index:
       name: Cosine Mcc
 ---
 
-#
+# Graphlet-AI/eridu
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) for person and company name matching using the [Open Sanctions matcher training data](https://www.opensanctions.org/docs/pairs/). It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used as part of a deep, fuzzy entity resolution process.
 
 ## Model Details
 
@@ -101,9 +100,9 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [s
 
 ### Model Sources
 
-- **Documentation:** [
-- **Repository:** [
-- **Hugging Face:** [
+- **Documentation:** [Graphlet-AI/eridu Documentation](https://github.com/Graphlet-AI/eridu)
+- **Repository:** [Graphlet-AI/eridu on GitHub](https://github.com/Graphlet-AI/eridu)
+- **Hugging Face:** [Graphlet-AI/eridu on Hugging Face](https://huggingface.co/Graphlet-AI/eridu)
 
 ### Full Model Architecture
 
@@ -129,14 +128,15 @@ Then you can load this model and run inference.
 from sentence_transformers import SentenceTransformer
 
 # Download from the 🤗 Hub
-model = SentenceTransformer("
-
-
-
-
-
+model = SentenceTransformer("Graphlet-AI/eridu")
+
+names = [
+    "Russell Jurney",
+    "Russ Jurney",
+    "Русс Джерни",
 ]
-
+
+embeddings = model.encode(names)
 print(embeddings.shape)
 # [3, 384]
 
@@ -144,6 +144,11 @@ print(embeddings.shape)
 similarities = model.similarity(embeddings, embeddings)
 print(similarities.shape)
 # [3, 3]
+
+print(similarities.numpy())
+# [[0.9999999  0.99406826 0.99406105]
+#  [0.9940683  1.         0.9969202 ]
+#  [0.99406105 0.9969202  1.        ]]
 ```
 
 <!--
@@ -157,7 +162,7 @@ print(similarities.shape)
 <!--
 ### Downstream Usage (Sentence Transformers)
 
-You can
+You can fine-tune this model on your own dataset.
 
 <details><summary>Click to expand</summary>
 
````
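The updated card describes the model as part of a deep, fuzzy entity resolution process and evaluates it as a binary classifier (cosine_ap, cosine_mcc). The sketch below shows one way that could look in practice; the `names_match` helper and the 0.8 decision threshold are illustrative assumptions rather than values published with this model, so the threshold should be tuned on labeled pairs.

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Graphlet-AI/eridu")

def names_match(name_a: str, name_b: str, threshold: float = 0.8) -> bool:
    """Treat two name strings as the same entity when their cosine similarity clears the threshold."""
    embeddings = model.encode([name_a, name_b])
    # model.similarity returns a [1, 1] tensor of cosine similarities for these two rows
    score = model.similarity(embeddings[0:1], embeddings[1:2]).item()
    return score >= threshold

# Name pairs drawn from the card's widget examples: a transliterated match and a non-match
print(names_match("Yevgeny Prigozhin", "Евге́ний Ви́кторович Приго́жин"))
print(names_match("Muhammad Ali", "Sonny Liston"))
```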
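The card's tags list `loss:ContrastiveLoss`, and the (currently commented-out) Downstream Usage section notes that the model can be fine-tuned on your own dataset. Below is a minimal fine-tuning sketch using the sentence-transformers v3 trainer; the tiny in-memory dataset and its column names are illustrative assumptions that stand in for real labeled pairs such as the Open Sanctions matcher training data.

```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

model = SentenceTransformer("Graphlet-AI/eridu")

# Illustrative pairs only: label 1 marks the same entity, label 0 marks different entities.
train_dataset = Dataset.from_dict({
    "sentence1": ["Russell Jurney", "Ben Lorica", "Muhammad Ali"],
    "sentence2": ["Russ Jurney", "Benjamin Lorica", "Sonny Liston"],
    "label": [1, 1, 0],
})

# ContrastiveLoss pulls positive pairs together and pushes negative pairs apart,
# matching the loss:ContrastiveLoss tag in the card metadata.
loss = losses.ContrastiveLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```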