Files changed (4)
  1. .gitattributes +0 -1
  2. README.md +0 -70
  3. model.safetensors +0 -3
  4. tokenizer_config.json +3 -1
.gitattributes CHANGED
@@ -7,4 +7,3 @@
  *.ot filter=lfs diff=lfs merge=lfs -text
  *.onnx filter=lfs diff=lfs merge=lfs -text
  *.msgpack filter=lfs diff=lfs merge=lfs -text
- model.safetensors filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,73 +1,3 @@
  ---
  language: zh
- license: apache-2.0
  ---
-
- # Bert-base-chinese
-
- ## Table of Contents
- - [Model Details](#model-details)
- - [Uses](#uses)
- - [Risks, Limitations and Biases](#risks-limitations-and-biases)
- - [Training](#training)
- - [Evaluation](#evaluation)
- - [How to Get Started With the Model](#how-to-get-started-with-the-model)
-
-
- ## Model Details
-
- ### Model Description
-
- This model has been pre-trained for Chinese; training and random input masking have been applied independently to word pieces (as in the original BERT paper).
-
- - **Developed by:** Google
- - **Model Type:** Fill-Mask
- - **Language(s):** Chinese
- - **License:** Apache 2.0
- - **Parent Model:** See the [BERT base uncased model](https://huggingface.co/bert-base-uncased) for more information about the BERT base model.
-
- ### Model Sources
- - **GitHub repo:** https://github.com/google-research/bert/blob/master/multilingual.md
- - **Paper:** [BERT](https://arxiv.org/abs/1810.04805)
-
-
- ## Uses
-
- #### Direct Use
-
- This model can be used for masked language modeling.
-
-
-
- ## Risks, Limitations and Biases
- **CONTENT WARNING: Readers should be aware this section contains content that is disturbing, offensive, and can propagate historical and current stereotypes.**
-
- Significant research has explored bias and fairness issues with language models (see, e.g., [Sheng et al. (2021)](https://aclanthology.org/2021.acl-long.330.pdf) and [Bender et al. (2021)](https://dl.acm.org/doi/pdf/10.1145/3442188.3445922)).
-
-
- ## Training
-
- #### Training Procedure
- * **type_vocab_size:** 2
- * **vocab_size:** 21128
- * **num_hidden_layers:** 12
-
- #### Training Data
- [More Information Needed]
-
- ## Evaluation
-
- #### Results
-
- [More Information Needed]
-
-
- ## How to Get Started With the Model
- ```python
- from transformers import AutoTokenizer, AutoModelForMaskedLM
-
- tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
-
- model = AutoModelForMaskedLM.from_pretrained("bert-base-chinese")
-
- ```
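The quickstart being removed above can also be exercised through the `fill-mask` pipeline; a minimal sketch (the example sentence is illustrative, not from the card):

```python
from transformers import pipeline

# Load bert-base-chinese as a fill-mask pipeline; the checkpoint is
# fetched from the Hugging Face Hub on first use.
fill_mask = pipeline("fill-mask", model="bert-base-chinese")

# "中国的首都是[MASK]京。" -- "The capital of China is [MASK]jing."
# The model should rank 北 (completing 北京, Beijing) near the top.
for prediction in fill_mask("中国的首都是[MASK]京。"):
    print(prediction["token_str"], round(prediction["score"], 3))
```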
 
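The hyperparameters listed under "Training Procedure" in the removed card can still be read back from the checkpoint's config; a quick sanity check, assuming the Hub copy of `bert-base-chinese` is unchanged:

```python
from transformers import AutoConfig

# The values below are the ones the removed model card documented.
config = AutoConfig.from_pretrained("bert-base-chinese")
print(config.type_vocab_size)    # 2
print(config.vocab_size)         # 21128
print(config.num_hidden_layers)  # 12
```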
model.safetensors DELETED
@@ -1,3 +0,0 @@
- version https://git-lfs.github.com/spec/v1
- oid sha256:3404a1ffd8da507042e8161013ba2a4fc49858b4e3f8fbf5ce5724f94883aec3
- size 411553788
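The deleted file was a git-lfs pointer, not the weights themselves: its three lines record the pointer-spec version, the SHA-256 of the real object, and its size in bytes (411553788, roughly 412 MB). A minimal sketch of reading such a pointer in Python (the helper is illustrative, not part of any git-lfs tooling):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split the "key value" lines of a git-lfs pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = (
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:3404a1ffd8da507042e8161013ba2a4fc49858b4e3f8fbf5ce5724f94883aec3\n"
    "size 411553788\n"
)
info = parse_lfs_pointer(pointer)
print(info["oid"], info["size"])  # hash and byte size of the stored object
```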
tokenizer_config.json CHANGED
@@ -1 +1,3 @@
- {"do_lower_case": false, "model_max_length": 512}
+ {
+   "do_lower_case": false
+ }
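A practical note on this change: with "model_max_length" gone from tokenizer_config.json, the loaded tokenizer may no longer report BERT's 512-token limit and, depending on the transformers version, can fall back to a very large sentinel value instead. A minimal sketch, assuming the Hub copy of `bert-base-chinese` reflects this config:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")

# Without "model_max_length" in the config, some transformers versions
# report a huge sentinel here (int(1e30)) rather than 512.
print(tokenizer.model_max_length)

# Safe either way: request truncation at BERT's positional limit explicitly.
encoded = tokenizer("这是一个测试句子。", truncation=True, max_length=512)
print(len(encoded["input_ids"]))
```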