---
license: mit
datasets:
- mozilla-foundation/common_voice_11_0
language:
- fa
metrics:
- wer
base_model:
- openai/whisper-tiny
pipeline_tag: automatic-speech-recognition
library_name: transformers
---
This model was fine-tuned on the validation split of the Common Voice 11.0 Persian dataset for one epoch, reaching a training loss of 0.05, and was evaluated on the test split with a loss of 0.07.
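
Since the card lists word error rate (WER) as the evaluation metric, one rough way to score the model on the Common Voice 11.0 Persian test split is sketched below. This is only an illustrative sketch, not the original evaluation script: it assumes you have accepted the Common Voice 11.0 terms on the Hugging Face Hub and are logged in (`huggingface-cli login`), it scores just a small slice of the split to keep runtime short, and depending on your `datasets` version you may also need `trust_remote_code=True` when loading the dataset.

    import torch
    import evaluate
    from datasets import load_dataset, Audio
    from transformers import WhisperProcessor, WhisperForConditionalGeneration

    model_name = "hackergeek98/tinyyyy_whisper"
    processor = WhisperProcessor.from_pretrained(model_name)
    model = WhisperForConditionalGeneration.from_pretrained(model_name)
    model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(language="fa", task="transcribe")
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device).eval()

    # A small slice of the Persian test split, resampled to the 16 kHz Whisper expects
    ds = load_dataset("mozilla-foundation/common_voice_11_0", "fa", split="test[:20]")
    ds = ds.cast_column("audio", Audio(sampling_rate=16000))

    wer = evaluate.load("wer")
    predictions, references = [], []
    for sample in ds:
        inputs = processor(
            sample["audio"]["array"],
            sampling_rate=sample["audio"]["sampling_rate"],
            return_tensors="pt",
        ).input_features.to(device)
        with torch.no_grad():
            predicted_ids = model.generate(inputs)
        predictions.append(processor.batch_decode(predicted_ids, skip_special_tokens=True)[0])
        references.append(sample["sentence"])

    print("WER:", wer.compute(predictions=predictions, references=references))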

How to use the model in Colab:

    # Install dependencies (run in a Colab cell)
    !pip install torch torchaudio transformers librosa gradio

    import torch
    import librosa
    from transformers import WhisperProcessor, WhisperForConditionalGeneration
    from google.colab import files

    # Load the fine-tuned Whisper model and processor
    model_name = "hackergeek98/tinyyyy_whisper"
    processor = WhisperProcessor.from_pretrained(model_name)
    model = WhisperForConditionalGeneration.from_pretrained(model_name)

    # Force the model to transcribe in Persian
    model.config.forced_decoder_ids = processor.get_decoder_prompt_ids(language="fa", task="transcribe")

    # Move the model to GPU if available
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)

    def transcribe_audio(audio_file):
        # Load the audio with librosa (supports multiple formats) and resample to 16 kHz
        audio_data, sampling_rate = librosa.load(audio_file, sr=16000)

        # Preprocess the audio into Whisper input features
        inputs = processor(audio_data, sampling_rate=sampling_rate, return_tensors="pt").input_features.to(device)

        # Generate the transcription
        with torch.no_grad():
            predicted_ids = model.generate(inputs)

        # Decode the predicted token IDs into text
        transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
        return transcription

    # Upload an audio file
    uploaded = files.upload()
    audio_file = list(uploaded.keys())[0]

    # Transcribe the audio
    transcription = transcribe_audio(audio_file)
    print("Transcription:", transcription)