metadata
tags:
- espnet
- audio
- speech-recognition
language: en
datasets:
- google/fleurs
license: cc-by-4.0
ESPnet2 ASR model
espnet/wanchichen_fleurs_multilingual_asr_hubert_frontend
This model was trained by William Chen using the fleurs recipe in ESPnet.
Demo: How to use in ESPnet2
cd espnet
pip install -e .
cd egs2/fleurs/asr1
./run.sh
RESULTS
Environments
- date: Sun Aug 21 15:18:30 EDT 2022
- python version: 3.8.6 (default, Dec 17 2020, 16:57:01) [GCC 10.2.0]
- espnet version: espnet 202205
- pytorch version: pytorch 1.8.1+cu102
- Git hash: 45e8cb9173a072f85ee7a7ccbcae06af7c5c484a
- Commit date: Wed Jun 1 14:21:14 2022 +0900
asr_train_asr_hubert_raw_all_raw_all_bpe6500_sp
WER
| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| decode_asr_asr_model_valid.acc.best/test_all | 77809 | 1669969 | 54.2 | 38.1 | 7.8 | 4.0 | 49.9 | 99.7 |
| decode_asr_asr_model_valid.acc.best/dev_all | 31622 | 610500 | 53.4 | 38.7 | 7.9 | 3.5 | 50.1 | 99.5 |
CER
| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| decode_asr_asr_model_valid.acc.best/test_all | 77809 | 10876592 | 85.1 | 7.4 | 7.5 | 3.2 | 18.0 | 99.7 |
| decode_asr_asr_model_valid.acc.best/dev_all | 31622 | 3988181 | 84.7 | 7.7 | 7.7 | 2.8 | 18.1 | 99.5 |
TER
| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| decode_asr_asr_model_valid.acc.best/test_all | 77809 | 9622352 | 82.8 | 9.2 | 8.0 | 3.4 | 20.6 | 99.7 |
| decode_asr_asr_model_valid.acc.best/dev_all | 31622 | 3547834 | 82.4 | 9.4 | 8.2 | 3.0 | 20.6 | 99.5 |
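Reading the tables: the sketch below recomputes the Err column of the WER, CER, and TER tables from the Sub, Del, and Ins columns. It assumes the usual sclite-style convention that Err = Sub + Del + Ins, with every percentage taken relative to the reference token count. The recipe output does not state this convention, so treat it as an assumption. The names `ROWS`, `error_rate`, and `max_gap` are illustrative and are not part of ESPnet.

```python
# Rows copied from the WER, CER, and TER tables (all values in percent).
# Tuple layout: (metric, dataset, Sub, Del, Ins, reported Err).
ROWS = [
    ("WER", "test_all", 38.1, 7.8, 4.0, 49.9),
    ("WER", "dev_all", 38.7, 7.9, 3.5, 50.1),
    ("CER", "test_all", 7.4, 7.5, 3.2, 18.0),
    ("CER", "dev_all", 7.7, 7.7, 2.8, 18.1),
    ("TER", "test_all", 9.2, 8.0, 3.4, 20.6),
    ("TER", "dev_all", 9.4, 8.2, 3.0, 20.6),
]


def error_rate(sub: float, dele: float, ins: float) -> float:
    """Assumed convention: Err = Sub + Del + Ins, rounded to one decimal."""
    return round(sub + dele + ins, 1)


# Track the largest gap between the recomputed and the reported Err.
max_gap = 0.0
for metric, dataset, sub, dele, ins, reported in ROWS:
    recomputed = error_rate(sub, dele, ins)
    max_gap = max(max_gap, abs(recomputed - reported))
    print(f"{metric} {dataset}: recomputed={recomputed:.1f} reported={reported:.1f}")
```

Under this assumption the WER and TER rows match the reported Err exactly, while the two CER rows come out 0.1 higher. That gap is consistent with each column having been rounded independently before it was written to the table.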