Update README.md
README.md
CHANGED
@@ -276,7 +276,7 @@ NVIDIA [NeMo Canary](https://nvidia.github.io/NeMo/blogs/2024/2024-02-canary/) i
 
 Canary is an encoder-decoder model with FastConformer [1] encoder and Transformer Decoder [2].
 With audio features extracted from the encoder, task tokens such as `<source language>`, `<target language>`, `<task>` and `<toggle PnC>`
-are fed into the Transformer Decoder to trigger the text generation process. Canary uses a concatenated tokenizer from individual
+are fed into the Transformer Decoder to trigger the text generation process. Canary uses a concatenated tokenizer [5] from individual
 SentencePiece [3] tokenizers of each language, which makes it easy to scale up to more languages.
 The Canay-1B model has 24 encoder layers and 24 layers of decoder layers in total.
 
@@ -479,7 +479,7 @@ BLEU score on [FLEURS](https://huggingface.co/datasets/google/fleurs) test set:
 
 | **Version** | **Model** | **En->De** | **En->Es** | **En->Fr** | **De->En** | **Es->En** | **Fr->En** |
 |:-----------:|:---------:|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
-| 1.23.0 | canary-1b |
+| 1.23.0 | canary-1b | 32.13 | 22.66 | 40.76 | 33.98 | 21.80 | 30.95 |
 
 
 BLEU score on [COVOST-v2](https://github.com/facebookresearch/covost) test set:
@@ -518,6 +518,7 @@ Check out [Riva live demo](https://developer.nvidia.com/riva#demos).
 
 [4] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
 
+[5] [Unified Model for Code-Switching Speech Recognition and Language Identification Based on Concatenated Tokenizer](https://aclanthology.org/2023.calcs-1.7.pdf)
 
 ## Licence
 
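
Note on the concatenated tokenizer cited as [5]: the README text says it is built from individual per-language SentencePiece tokenizers and that this makes it easy to add languages. A minimal sketch of that idea follows, assuming each language keeps its own vocabulary and token IDs are shifted by a per-language offset so the ID ranges never overlap. `ToyTokenizer` and `ConcatenatedTokenizer` are hypothetical names for illustration, not NeMo or SentencePiece APIs, and a word-level lookup stands in for a real SentencePiece model.

```python
from typing import Dict, List


class ToyTokenizer:
    """Hypothetical stand-in for one language's SentencePiece model (word-level)."""

    def __init__(self, vocab: List[str]):
        self.vocab = list(vocab)
        self.index = {tok: i for i, tok in enumerate(self.vocab)}

    def encode(self, text: str) -> List[int]:
        # Raises KeyError on unknown words; a real subword model would not.
        return [self.index[word] for word in text.split()]


class ConcatenatedTokenizer:
    """Assumed scheme: each language owns a disjoint, contiguous range of IDs."""

    def __init__(self, tokenizers: Dict[str, ToyTokenizer]):
        self.tokenizers = tokenizers
        self.offsets: Dict[str, int] = {}
        offset = 0
        for lang, tok in tokenizers.items():
            self.offsets[lang] = offset  # first global ID used by this language
            offset += len(tok.vocab)
        self.vocab_size = offset

    def encode(self, text: str, lang: str) -> List[int]:
        shift = self.offsets[lang]
        return [shift + i for i in self.tokenizers[lang].encode(text)]

    def lang_of(self, token_id: int) -> str:
        # The ID range alone identifies the language of a token.
        for lang, shift in self.offsets.items():
            if shift <= token_id < shift + len(self.tokenizers[lang].vocab):
                return lang
        raise ValueError(f"token id {token_id} is out of range")

    def decode(self, ids: List[int]) -> str:
        words = []
        for i in ids:
            lang = self.lang_of(i)
            words.append(self.tokenizers[lang].vocab[i - self.offsets[lang]])
        return " ".join(words)


tok = ConcatenatedTokenizer({
    "en": ToyTokenizer(["hello", "world"]),
    "de": ToyTokenizer(["hallo", "welt"]),
})
en_ids = tok.encode("hello world", "en")
de_ids = tok.encode("hallo welt", "de")
mixed = tok.decode([en_ids[0], de_ids[1]])
```

Under this assumed scheme, adding a language only appends a new ID range; the existing tokenizers and their IDs are left untouched, which is one plausible reading of "easy to scale up to more languages".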