Commit 5e2ca59 by ngwlh · 1 parent: 4687f45

Update README.md

Files changed (1)
  1. README.md +33 -0
README.md CHANGED
@@ -1,3 +1,36 @@
  ---
+ language:
+ - multilingual
+ - en
+ - zh
  license: apache-2.0
  ---
+
+ # KBioXLM
+
+ KBioXLM is obtained by continuing the training of XLM-R on an aligned corpus built with a knowledge-anchored method, using a multi-task training strategy. To our knowledge, it is the first multilingual biomedical pre-trained language model with cross-lingual understanding capabilities in the medical domain. It was introduced in the paper [KBioXLM: A Knowledge-anchored Biomedical
+ Multilingual Pretrained Language Model]() by Geng et al. and first released in [this repository](https://github.com/ngwlh-gl/KBioXLM/tree/main).
+
+ ## Model description
+ KBioXLM can be fine-tuned on downstream biomedical cross-lingual understanding tasks, such as biomedical entity recognition, biomedical relation extraction, and biomedical text classification.
+
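+ ## Usage
+
+ The snippet below loads the model parameters. It strips the `model.roberta.` prefix from the checkpoint keys so that they match the parameter names of `RobertaModel`:
+
+ ```python
+ import logging
+ from collections import OrderedDict
+ import torch
+ from transformers import AutoConfig, RobertaModel
+
+ logger = logging.getLogger(__name__)
+ model_dir = 'ngwlh/KBioXLM'  # assumed to be a local directory containing the model files
+ config = AutoConfig.from_pretrained(model_dir)  # assumption: stands in for `self.config` in the original snippet
+ model = RobertaModel.from_pretrained(pretrained_model_name_or_path=model_dir, config=config)
+ checkpoint = torch.load(model_dir + '/pytorch_model.bin', map_location='cpu')
+ # Strip the 'model.roberta.' prefix so the keys match RobertaModel's parameter names.
+ new_state_dict = OrderedDict()
+ for k, v in checkpoint.items():
+     if k.startswith('model.roberta.'):
+         k = k.replace('model.roberta.', '')
+     new_state_dict[k] = v
+ msg = model.load_state_dict(new_state_dict, strict=False)
+ logger.info(msg)  # reports missing and unexpected keys
+ ```
+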
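The key-renaming step in the snippet above uses `str.replace`, which removes every occurrence of the prefix, not only a leading one. The following is a minimal sketch of a stricter variant, under the assumption that only a leading `model.roberta.` should be dropped. `strip_prefix` and `toy_checkpoint` are hypothetical names used for illustration; they are not part of the KBioXLM repository or of `transformers`.

```python
from collections import OrderedDict


def strip_prefix(state_dict, prefix="model.roberta."):
    """Return a copy of state_dict with a leading prefix removed from its keys.

    Keys that do not start with the prefix are copied unchanged.
    """
    remapped = OrderedDict()
    for key, value in state_dict.items():
        if key.startswith(prefix):
            key = key[len(prefix):]  # drop only the leading occurrence
        remapped[key] = value
    return remapped


# Toy stand-in for a checkpoint; in a real checkpoint the values are tensors.
toy_checkpoint = OrderedDict([
    ("model.roberta.embeddings.word_embeddings.weight", 0),
    ("model.roberta.encoder.layer.0.output.dense.bias", 1),
    ("classifier.weight", 2),
])
remapped = strip_prefix(toy_checkpoint)
print(list(remapped))
```

The remapped dictionary can then be passed to `model.load_state_dict(remapped, strict=False)`. Because `strict=False` does not raise on mismatched names, it is worth inspecting the `missing_keys` and `unexpected_keys` fields of the returned result to confirm that the encoder weights were actually loaded.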
+ ### BibTeX entry and citation info
+
+ Coming soon.