Update README.md
README.md
CHANGED
 
 # SpeechLLM
 
+SpeechLLM is a multi-modal LLM trained to predict the metadata of the speaker's turn in a conversation. SpeechLLM model is based on HubertX acoustic encoder and TinyLlama LLM. The model predicts the following:
+1. Speech Activity
+2. ASR Transcript
+3. Gender of the speaker
+4. Age of the speaker
+5. Accent of the speaker
+6. Emotion of the speaker
+
 ## Usage
 ```python
 # Load model directly from huggingface