giuseppe-tanzi committed on
Commit 867d40b · verified · 1 Parent(s): 8525e7c

Upload folder using huggingface_hub

Files changed (1)
  1. README.md +1 -4
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
  - seamless
  - subtitle-editing-time-prediction
  library_name: transformers
- pipeline_tag: audio-regression
+ base_model: facebook/hf-seamless-m4t-medium
  ---
 
  # videoloc/seamless-basic
@@ -24,7 +24,6 @@ The model is built on top of Meta's SeamlessM4T and fine-tuned on a multimodal d
  - **Multimodal Processing**: Simultaneously processes audio (16kHz) and text inputs
  - **Frozen Encoders**: Uses pre-trained SeamlessM4T encoders (frozen for stability)
  - **TTE Prediction**: Predicts editing time required for subtitle segments
- - **Efficient Architecture**: Optimized for inference with gradient checkpointing support
  - **Direct Output**: Raw time values in seconds for immediate use
 
  ## Model Architecture
@@ -156,8 +155,6 @@ data = [
  - **Dataset Split**: 80/20 train/test
  - **Random Seed**: 42
  - **Metric**: RMSE (lower is better)
- - **Audio Caching**: Enabled with compression
- - **Workers**: 8
 
  ## Training Configuration