giuseppe-tanzi committed on
Commit a6453d8 · verified · 1 Parent(s): 14dc36f

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -17,7 +17,7 @@ base_model: facebook/hf-seamless-m4t-medium

  This is a **SeamlessBasic** model that processes audio and text inputs to predict **Time To Edit (TTE)** for subtitle segments. Given an audio segment and its corresponding subtitle text, the model predicts how much time (in seconds) would be required to edit/refine that subtitle segment.

- The model is built on top of Meta's SeamlessM4T and fine-tuned on a multimodal dataset containing audio-subtitle pairs with editing time annotations.
+ The model is built on top of Meta's SeamlessM4T and fine-tuned on a multimodal dataset containing audio-subtitle pairs with editing time annotations across 5 languages: **English, French, Spanish, Italian, and German**.

  ### Key Features

@@ -164,7 +164,7 @@ data = [

  The model was trained with the following specifications:

- - **Dataset**: Multimodal audio-subtitle pairs with TTE annotations
+ - **Dataset**: Multimodal audio-subtitle pairs with TTE annotations (5 languages: EN, FR, ES, IT, DE)
  - **Train/Test Split**: 80/20 with random seed 42
  - **Audio Processing**: 16kHz sampling, max 8.0 seconds, no offset
  - **Text Processing**: Max 256 tokens
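
The training specifications listed in the README (80/20 split with seed 42, 16 kHz audio capped at 8.0 seconds with no offset) could be reproduced along the following lines. This is only a minimal sketch: the helper names `truncate_audio` and `train_test_split` are hypothetical, the audio is assumed to be a plain sequence of samples already resampled to 16 kHz, and the author's actual preprocessing code is not shown in this commit and may differ.

```python
import random

# Values taken from the README's training specifications.
SAMPLE_RATE = 16_000   # 16 kHz sampling
MAX_SECONDS = 8.0      # maximum audio length
SEED = 42              # random seed for the split


def truncate_audio(samples, sample_rate=SAMPLE_RATE,
                   max_seconds=MAX_SECONDS, offset_seconds=0.0):
    """Keep at most `max_seconds` of audio starting at `offset_seconds`.

    Hypothetical helper: `samples` is assumed to be a sequence of audio
    samples already at `sample_rate`.
    """
    start = int(offset_seconds * sample_rate)
    end = start + int(max_seconds * sample_rate)
    return samples[start:end]


def train_test_split(items, test_fraction=0.2, seed=SEED):
    """Shuffle a copy of `items` with a fixed seed and split it 80/20.

    Hypothetical helper: returns (train, test) as two lists.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


if __name__ == "__main__":
    train, test = train_test_split(range(10))
    print(len(train), len(test))
    print(len(truncate_audio([0.0] * 200_000)))
```

Using a dedicated `random.Random(seed)` instance rather than the global generator keeps the split reproducible regardless of other random calls in the program.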