update readme
--- a/README.md
+++ b/README.md
@@ -152,7 +152,7 @@ To understand the capabilities, Phi-4-multimodal-instruct was compared with a s
 
 The Phi-4-multimodal-instruct was observed as
 - Having strong automatic speech recognition (ASR) and speech translation (ST) performance, surpassing expert ASR model WhisperV3 and ST models SeamlessM4T-v2-Large.
-- Ranking number 1 on the Huggingface OpenASR leaderboard with word error rate 6.14% in comparison with the current best model 6.5% as of
+- Ranking number 1 on the [Huggingface OpenASR](https://huggingface.co/spaces/hf-audio/open_asr_leaderboard) leaderboard with word error rate 6.14% in comparison with the current best model 6.5% as of March 04, 2025.
 - Being the first open-sourced model that can perform speech summarization, and the performance is close to GPT4o.
 - Having a gap with close models, e.g. Gemini-1.5-Flash and GPT-4o-realtime-preview, on speech QA task. Work is being undertaken to improve this capability in the next iterations.
 
@@ -468,8 +468,6 @@ response = processor.batch_decode(
 print(f'>>> Response\n{response}')
 ```
 
-**Notes**:
-
 ## Responsible AI Considerations
 
 Like other language models, the Phi family of models can potentially behave in ways that are unfair, unreliable, or offensive. Some of the limiting behaviors to be aware of include: