Update README.md
This model is a BERT based Myanmar pre-trained language model.

MyanBERTa was pre-trained for 528K steps on a word-segmented Myanmar dataset consisting of 5,992,299 sentences (136M words). As the tokenizer, a byte-level BPE tokenizer with 30,522 subword units, learned after word segmentation, is applied.
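The checkpoint and tokenizer can be loaded with the Hugging Face `transformers` library. The snippet below is a minimal sketch, not part of the original card: the repository id `UCSYNLP/MyanBERTa` is an assumption used for illustration, and input text is expected to be word-segmented as described above.

```python
# Minimal usage sketch (assumed Hub id, adjust if the checkpoint lives elsewhere).
from transformers import AutoTokenizer, AutoModelForMaskedLM

model_id = "UCSYNLP/MyanBERTa"  # assumption for illustration

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Input should be word-segmented Myanmar text, since the byte-level BPE
# vocabulary was learned after word segmentation.
text = "မင်္ဂလာပါ"
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.shape)  # (batch, sequence_length, vocab_size)
```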
Cite this work as:

```
Aye Mya Hlaing, Win Pa Pa, "MyanBERTa: A Pre-trained Language Model For
Myanmar", In Proceedings of 2022 International Conference on Communication and Computer Research (ICCR2022), November 2022, Seoul, Republic of Korea
```

[Download Paper](https://journal-home.s3.ap-northeast-2.amazonaws.com/site/iccr2022/abs/QOHFI-0004.pdf)