SpeechTek/mEUltilingual_speechllm_linear_projector_v1

Automatic Speech Recognition


seraphina committed on Jun 16

Commit f2bf3bb · verified · 1 Parent(s): 81daced

Update README.md

Files changed (1)

README.md +4 -4


@@ -28,7 +28,7 @@ This model is trained for Automatic Speech Recognition (ASR).
 ## How to Get Started with the Model
-This linear projector can be used using the shell scripts provided in the [SLAM-ASR](https://github.com/X-LANCE/SLAM-LLM/tree/main/examples/asr_librispeech) codebase. Kindly refer to the instructions there for further details.
+This linear projector can be utilised for further finetuning or decoding using the shell scripts provided in the [SLAM-ASR](https://github.com/X-LANCE/SLAM-LLM/tree/main/examples/asr_librispeech) codebase. Kindly refer to the instructions there for further details.
 Whisper-large-v3-turbo and EuroLLM 1.7B must be downloaded before using this linear projector.
@@ -41,11 +41,11 @@ Specifically, the training set consisted of 92.5 hours of Common Voice data + 7.
 ### Training Procedure
-* The linear projector was trained using the code-based provided by the official [SLAM-ASR Github repository](https://github.com/X-LANCE/SLAM-LLM/tree/main/examples/asr_librispeech) with `torchrun`.
+* The model was trained using the code-based provided by the official [SLAM-ASR Github repository](https://github.com/X-LANCE/SLAM-LLM/tree/main/examples/asr_librispeech) with `torchrun`.
 * Only the linear projector was trained.
-* The whisper-large-v3-turbo speech encoder (Whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo))
+* The whisper-large-v3-turbo speech encoder ([Whisper-large-v3-turbo](https://huggingface.co/openai/whisper-large-v3-turbo))
 and LLM ([EuroLLM-1.7B](https://huggingface.co/utter-project/EuroLLM-1.7B)) were kept frozen.
-* No prompt was used during training and inference
+* No prompt was used during training and inference.
 * Training was conducted with one NVIDIA Ada Lovelace L40S GPU.
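
The training setup described in the README (frozen speech encoder, frozen LLM, trainable linear projector) can be illustrated with a minimal PyTorch sketch. This is not the SLAM-ASR implementation: the tiny `nn.Linear` stand-ins for the encoder and LLM, the single-layer `LinearProjector`, the dimensions, and the optimizer settings are all assumptions chosen only to show which parameters receive gradients.

```python
import torch
from torch import nn


class LinearProjector(nn.Module):
    """Hypothetical single-layer projector from encoder space to LLM space."""

    def __init__(self, enc_dim: int, llm_dim: int):
        super().__init__()
        self.proj = nn.Linear(enc_dim, llm_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)


def freeze(module: nn.Module) -> nn.Module:
    """Disable gradients for every parameter and switch to eval mode."""
    for p in module.parameters():
        p.requires_grad = False
    module.eval()
    return module


torch.manual_seed(0)
enc_dim, llm_dim, vocab = 16, 32, 10  # illustrative sizes, not the real models

# Stand-ins (assumptions): the real setup uses Whisper-large-v3-turbo and EuroLLM-1.7B.
encoder = freeze(nn.Linear(8, enc_dim))
llm = freeze(nn.Linear(llm_dim, vocab))
projector = LinearProjector(enc_dim, llm_dim)

# Only the projector's parameters are handed to the optimizer.
optimizer = torch.optim.AdamW(projector.parameters(), lr=1e-3)

features = torch.randn(2, 5, 8)            # (batch, frames, feature_dim)
targets = torch.randint(0, vocab, (2, 5))  # dummy token ids

with torch.no_grad():                      # frozen encoder: no graph needed
    enc_out = encoder(features)

logits = llm(projector(enc_out))           # gradients flow through the frozen LLM to the projector
loss = nn.functional.cross_entropy(logits.reshape(-1, vocab), targets.reshape(-1))
loss.backward()
optimizer.step()

trainable = [name for name, p in projector.named_parameters() if p.requires_grad]
```

After the backward pass, only `projector` holds gradients; the frozen stand-ins are left untouched, which mirrors the "only the linear projector was trained" bullet.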