msekoyan committed on
Commit f7b7a2b · 1 Parent(s): a12ed95

add asr results to metadata

Signed-off-by: monica-sekoyan <[email protected]>

Files changed (1)
  1. README.md +633 -0
README.md CHANGED
@@ -35,6 +35,639 @@ metrics:
  - comet
  pipeline_tag: automatic-speech-recognition
  library_name: nemo
+ tags:
+ - automatic-speech-recognition
+ - automatic-speech-translation
+ - speech
+ - audio
+ - Transformer
+ - FastConformer
+ - Conformer
+ - pytorch
+ - NeMo
+ - hf-asr-leaderboard
+ model-index:
+ - name: canary-1b-v2
+   results:
+   # FLEURS ASR Results
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: bg_bg
+       split: test
+       args:
+         language: bg
+     metrics:
+     - name: Test WER (Bg)
+       type: wer
+       value: 9.25
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: cs_cz
+       split: test
+       args:
+         language: cs
+     metrics:
+     - name: Test WER (Cs)
+       type: wer
+       value: 7.86
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: da_dk
+       split: test
+       args:
+         language: da
+     metrics:
+     - name: Test WER (Da)
+       type: wer
+       value: 11.25
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: de_de
+       split: test
+       args:
+         language: de
+     metrics:
+     - name: Test WER (De)
+       type: wer
+       value: 4.40
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: el_gr
+       split: test
+       args:
+         language: el
+     metrics:
+     - name: Test WER (El)
+       type: wer
+       value: 9.21
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: en_us
+       split: test
+       args:
+         language: en
+     metrics:
+     - name: Test WER (En)
+       type: wer
+       value: 4.50
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: es_419
+       split: test
+       args:
+         language: es
+     metrics:
+     - name: Test WER (Es)
+       type: wer
+       value: 2.90
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: et_ee
+       split: test
+       args:
+         language: et
+     metrics:
+     - name: Test WER (Et)
+       type: wer
+       value: 12.55
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: fi_fi
+       split: test
+       args:
+         language: fi
+     metrics:
+     - name: Test WER (Fi)
+       type: wer
+       value: 8.59
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: fr_fr
+       split: test
+       args:
+         language: fr
+     metrics:
+     - name: Test WER (Fr)
+       type: wer
+       value: 5.02
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: hr_hr
+       split: test
+       args:
+         language: hr
+     metrics:
+     - name: Test WER (Hr)
+       type: wer
+       value: 8.29
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: hu_hu
+       split: test
+       args:
+         language: hu
+     metrics:
+     - name: Test WER (Hu)
+       type: wer
+       value: 12.90
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: it_it
+       split: test
+       args:
+         language: it
+     metrics:
+     - name: Test WER (It)
+       type: wer
+       value: 3.07
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: lt_lt
+       split: test
+       args:
+         language: lt
+     metrics:
+     - name: Test WER (Lt)
+       type: wer
+       value: 12.36
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: lv_lv
+       split: test
+       args:
+         language: lv
+     metrics:
+     - name: Test WER (Lv)
+       type: wer
+       value: 9.66
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: mt_mt
+       split: test
+       args:
+         language: mt
+     metrics:
+     - name: Test WER (Mt)
+       type: wer
+       value: 18.31
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: nl_nl
+       split: test
+       args:
+         language: nl
+     metrics:
+     - name: Test WER (Nl)
+       type: wer
+       value: 6.12
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: pl_pl
+       split: test
+       args:
+         language: pl
+     metrics:
+     - name: Test WER (Pl)
+       type: wer
+       value: 6.64
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: pt_br
+       split: test
+       args:
+         language: pt
+     metrics:
+     - name: Test WER (Pt)
+       type: wer
+       value: 4.39
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: ro_ro
+       split: test
+       args:
+         language: ro
+     metrics:
+     - name: Test WER (Ro)
+       type: wer
+       value: 6.61
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: ru_ru
+       split: test
+       args:
+         language: ru
+     metrics:
+     - name: Test WER (Ru)
+       type: wer
+       value: 6.90
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: sk_sk
+       split: test
+       args:
+         language: sk
+     metrics:
+     - name: Test WER (Sk)
+       type: wer
+       value: 5.74
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: sl_si
+       split: test
+       args:
+         language: sl
+     metrics:
+     - name: Test WER (Sl)
+       type: wer
+       value: 13.32
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: sv_se
+       split: test
+       args:
+         language: sv
+     metrics:
+     - name: Test WER (Sv)
+       type: wer
+       value: 9.57
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: FLEURS
+       type: google/fleurs
+       config: uk_ua
+       split: test
+       args:
+         language: uk
+     metrics:
+     - name: Test WER (Uk)
+       type: wer
+       value: 10.50
+   # Multilingual LibriSpeech ASR Results
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: Multilingual LibriSpeech
+       type: facebook/multilingual_librispeech
+       config: spanish
+       split: test
+       args:
+         language: es
+     metrics:
+     - name: Test WER (Es)
+       type: wer
+       value: 2.94
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: Multilingual LibriSpeech
+       type: facebook/multilingual_librispeech
+       config: french
+       split: test
+       args:
+         language: fr
+     metrics:
+     - name: Test WER (Fr)
+       type: wer
+       value: 3.36
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: Multilingual LibriSpeech
+       type: facebook/multilingual_librispeech
+       config: italian
+       split: test
+       args:
+         language: it
+     metrics:
+     - name: Test WER (It)
+       type: wer
+       value: 9.16
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: Multilingual LibriSpeech
+       type: facebook/multilingual_librispeech
+       config: dutch
+       split: test
+       args:
+         language: nl
+     metrics:
+     - name: Test WER (Nl)
+       type: wer
+       value: 11.27
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: Multilingual LibriSpeech
+       type: facebook/multilingual_librispeech
+       config: polish
+       split: test
+       args:
+         language: pl
+     metrics:
+     - name: Test WER (Pl)
+       type: wer
+       value: 8.77
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: Multilingual LibriSpeech
+       type: facebook/multilingual_librispeech
+       config: portuguese
+       split: test
+       args:
+         language: pt
+     metrics:
+     - name: Test WER (Pt)
+       type: wer
+       value: 8.14
+   # CoVoST2 ASR Results
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: de
+       split: test
+       args:
+         language: de
+     metrics:
+     - name: Test WER (De)
+       type: wer
+       value: 5.53
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: en
+       split: test
+       args:
+         language: en
+     metrics:
+     - name: Test WER (En)
+       type: wer
+       value: 6.85
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: es
+       split: test
+       args:
+         language: es
+     metrics:
+     - name: Test WER (Es)
+       type: wer
+       value: 3.81
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: et
+       split: test
+       args:
+         language: et
+     metrics:
+     - name: Test WER (Et)
+       type: wer
+       value: 18.28
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: fr
+       split: test
+       args:
+         language: fr
+     metrics:
+     - name: Test WER (Fr)
+       type: wer
+       value: 6.30
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: it
+       split: test
+       args:
+         language: it
+     metrics:
+     - name: Test WER (It)
+       type: wer
+       value: 4.80
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: lv
+       split: test
+       args:
+         language: lv
+     metrics:
+     - name: Test WER (Lv)
+       type: wer
+       value: 11.49
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: nl
+       split: test
+       args:
+         language: nl
+     metrics:
+     - name: Test WER (Nl)
+       type: wer
+       value: 6.93
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: pt
+       split: test
+       args:
+         language: pt
+     metrics:
+     - name: Test WER (Pt)
+       type: wer
+       value: 6.87
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: ru
+       split: test
+       args:
+         language: ru
+     metrics:
+     - name: Test WER (Ru)
+       type: wer
+       value: 5.14
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: sl
+       split: test
+       args:
+         language: sl
+     metrics:
+     - name: Test WER (Sl)
+       type: wer
+       value: 7.59
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: sv
+       split: test
+       args:
+         language: sv
+     metrics:
+     - name: Test WER (Sv)
+       type: wer
+       value: 13.32
+   - task:
+       type: Automatic Speech Recognition
+       name: automatic-speech-recognition
+     dataset:
+       name: CoVoST2
+       type: covost2
+       config: uk
+       split: test
+       args:
+         language: uk
+     metrics:
+     - name: Test WER (Uk)
+       type: wer
+       value: 18.15
  ---
  ## <span style="color:#ffb300;">🐤 Canary 1B v2: Multitask Speech Transcription and Translation Model </span>
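
The `results` entries added in this diff all share the same task / dataset / metrics shape. As a rough illustration of how such entries might be consumed, the sketch below copies four of the WER values from the diff into plain Python dicts and computes an unweighted mean per dataset. The function name `mean_wer_by_dataset` and the choice of an unweighted mean are assumptions made for this example, not something the commit defines, and the averages cover only the four copied entries rather than the full benchmark.

```python
from collections import defaultdict
from statistics import mean

# Hand-copied subset of the `results` entries from the diff (German and
# English on FLEURS and CoVoST2), written as plain dicts so that the
# sketch needs no YAML parser.
results = [
    {"dataset": {"name": "FLEURS", "config": "de_de"},
     "metrics": [{"name": "Test WER (De)", "type": "wer", "value": 4.40}]},
    {"dataset": {"name": "FLEURS", "config": "en_us"},
     "metrics": [{"name": "Test WER (En)", "type": "wer", "value": 4.50}]},
    {"dataset": {"name": "CoVoST2", "config": "de"},
     "metrics": [{"name": "Test WER (De)", "type": "wer", "value": 5.53}]},
    {"dataset": {"name": "CoVoST2", "config": "en"},
     "metrics": [{"name": "Test WER (En)", "type": "wer", "value": 6.85}]},
]


def mean_wer_by_dataset(entries):
    """Group WER values by dataset name and return each group's unweighted mean.

    Hypothetical helper for illustration only; it is not part of NeMo or
    of the Hugging Face Hub tooling.
    """
    grouped = defaultdict(list)
    for entry in entries:
        for metric in entry["metrics"]:
            if metric["type"] == "wer":
                grouped[entry["dataset"]["name"]].append(metric["value"])
    # Round to two decimals to match the precision used in the metadata.
    return {name: round(mean(values), 2) for name, values in grouped.items()}


summary = mean_wer_by_dataset(results)
print(summary)
```

Extending the `results` list with more entries copied from the diff would give per-dataset means over a larger language set. A mean weighted by test-set size would need utterance counts that this metadata does not carry.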