Update README.md (#7)
Update README.md (f97bbd620972efd5f2c4d22652c2bbde29cd7746)
Co-authored-by: He Huang <[email protected]>
README.md CHANGED
@@ -402,7 +402,7 @@ The model outputs the transcribed/translated text corresponding to the input audio.
 ## Training
 
 Canary-1B is trained using the NVIDIA NeMo toolkit [4] for 150k steps with dynamic bucketing and a batch duration of 360s per GPU on 128 NVIDIA A100 80GB GPUs.
 
-The model can be trained using this [example script](https://github.com/NVIDIA/NeMo/blob/
+The model can be trained using this [example script](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/speech_multitask/speech_to_text_aed.py) and [base config](https://github.com/NVIDIA/NeMo/blob/main/examples/asr/conf/speech_multitask/fast-conformer_aed.yaml).
 
 The tokenizers for these models were built using the text transcripts of the train set with this [script](https://github.com/NVIDIA/NeMo/blob/main/scripts/tokenizers/process_asr_text_tokenizer.py).
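For context, a minimal sketch of how the two linked NeMo scripts are commonly launched from a NeMo checkout. The data paths, the Hydra overrides, and the tokenizer flag values below are illustrative assumptions, not taken from this commit or from the actual Canary-1B training setup; check each script's own help and the base config for the authoritative options.

```python
# Sketch only: assumed invocations of the tokenizer-building and training
# scripts referenced in the README diff above. Run from the NeMo repo root.
import subprocess

# Build a SentencePiece tokenizer from training transcripts.
# Flag names follow the script's documented usage; the paths and vocab size
# are hypothetical placeholders.
subprocess.run(
    [
        "python", "scripts/tokenizers/process_asr_text_tokenizer.py",
        "--manifest=/data/train_manifest.json",  # hypothetical manifest path
        "--data_root=/data/tokenizers/",         # hypothetical output directory
        "--vocab_size=1024",                     # illustrative value
        "--tokenizer=spe",
        "--log",
    ],
    check=True,
)

# Launch AED training with the base config. NeMo example scripts use Hydra,
# so the config location and any overrides are passed on the command line;
# the override below is an assumption, not a value from this commit.
subprocess.run(
    [
        "python", "examples/asr/speech_multitask/speech_to_text_aed.py",
        "--config-path=../conf/speech_multitask",
        "--config-name=fast-conformer_aed",
        "trainer.devices=-1",                    # assumed override: use all GPUs
    ],
    check=True,
)
```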