Fine-tuned Whisper Medium on the Tadabur dataset for Quran ASR, Surah/Ayah identification, and reciter recognition.
CS465 Machine Learning Project — Spring 2026
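Given a Quran audio recitation, the pipeline returns the Arabic transcription, the matched Surah name and Ayah number with a match confidence, and the predicted reciter name.

| Model | WER (%) | CER (%) |
|---|---|---|
| Whisper Medium Vanilla | 41.10 | 11.47 |
| Tadabur-Whisper-Small (Author) | 47.06 | 12.28 |
| This model | 6.26 | 4.41 |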
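
WER and CER in the table above are edit-distance ratios: the number of word (or character) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The sketch below is a dependency-free illustration of that definition. The function names are made up for this example, and this card does not say which tool or text normalization produced the reported figures.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edits (spaces included)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


ref = "الحمد لله رب العالمين"
hyp = "الحمد لله رب العلمين"   # simulated ASR output with one letter dropped
print(f"WER: {wer(ref, hyp):.2%}")  # one wrong word out of four -> 25.00%
print(f"CER: {cer(ref, hyp):.2%}")
```

A single dropped letter costs a full word at the word level but only one character at the character level, which is why CER is usually much lower than WER.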

Reciter classifier:

| Metric | Value |
|---|---|
| Supported reciters | 335 |
| Validation accuracy | 98.47% |
| Training accuracy | 98.71% |

Repository files:

| File | Size | Description |
|---|---|---|
| model.safetensors | 3.06 GB | Fine-tuned Whisper Medium weights |
| reciter_classifier.pt | 2.76 MB | MLP reciter classifier |
| reciter_idx_to_id.json | 1.25 KB | Classifier index → reciter ID |
| reciter_id_to_idx.json | 1.25 KB | Reciter ID → classifier index |
| sheikh_dict.json | 2.7 KB | Reciter ID → Arabic name |
| surah_dict.json | 2.7 KB | Surah index → Arabic name |
| quran_simple.json | ~3 MB | Full Quran text for matching |
| supported_reciters.txt | — | List of all 335 supported reciters |

Install the dependencies:

```bash
pip install transformers torch librosa rapidfuzz huggingface_hub
```

Transcribe a recitation:

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
import torch

MODEL = "rakansuliman/tadabur-whisper-medium"

processor = WhisperProcessor.from_pretrained(MODEL)
model = WhisperForConditionalGeneration.from_pretrained(MODEL)
model.eval()

# Whisper expects 16 kHz mono audio; librosa resamples on load
audio, _ = librosa.load("recitation.wav", sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt").input_features

with torch.no_grad():
    ids = model.generate(
        inputs,
        language="arabic",
        task="transcribe",
        max_new_tokens=225,
        suppress_tokens=[],
        forced_decoder_ids=None,
    )

print(processor.batch_decode(ids, skip_special_tokens=True)[0])
```

Identify the reciter (this reuses model and inputs from the transcription example):

```python
import json

import torch
import torch.nn as nn
from huggingface_hub import hf_hub_download

MODEL = "rakansuliman/tadabur-whisper-medium"

# Download classifier files
hf_hub_download(MODEL, "reciter_classifier.pt", local_dir="./")
hf_hub_download(MODEL, "reciter_idx_to_id.json", local_dir="./")
hf_hub_download(MODEL, "sheikh_dict.json", local_dir="./")


# Define classifier (must match training architecture)
class ReciterClassifier(nn.Module):
    def __init__(self, hidden_dim, num_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(hidden_dim, 512), nn.BatchNorm1d(512), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(512, 256), nn.BatchNorm1d(256), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(256, num_classes),
        )

    def forward(self, x):
        return self.net(x)


# Load mappings
with open("reciter_idx_to_id.json") as f:
    idx_to_id = {int(k): int(v) for k, v in json.load(f).items()}
with open("sheikh_dict.json", encoding="utf-8-sig") as f:
    # Invert the loaded dict so that integer reciter IDs map to names
    sheikh = {int(v): k for k, v in json.load(f).items()}

# Load classifier (1024 is the Whisper Medium encoder hidden size)
clf = ReciterClassifier(1024, len(idx_to_id))
clf.load_state_dict(torch.load("reciter_classifier.pt", map_location="cpu"))
clf.eval()

# Run encoder + classify reciter
with torch.no_grad():
    encoder_out = model.model.encoder(inputs)
    # Mean-pool the encoder states over time to get one embedding per clip
    embedding = encoder_out.last_hidden_state.mean(dim=1).float()
    logits = clf(embedding)

pred_idx = logits.argmax(dim=1).item()
confidence = torch.softmax(logits, dim=1).max().item()
reciter_id = idx_to_id[pred_idx]
reciter_name = sheikh.get(reciter_id, f"ID {reciter_id}")
print(f"Reciter: {reciter_name} ({confidence*100:.1f}%)")
```
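
The confidence printed above is the largest softmax probability over the classifier logits. The dependency-free sketch below shows that calculation and extends it to the top-k candidates, which may help when the top prediction is not clear-cut. The logits, the ID mapping, and the `top_k` helper are made up for illustration and are not part of the released code.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, idx_to_id, k=3):
    """Return the k most probable (reciter_id, probability) pairs."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)
    return [(idx_to_id[i], probs[i]) for i in ranked[:k]]


# Hypothetical logits for a 4-class classifier and a made-up index -> ID map
example_logits = [2.0, 0.5, 3.5, -1.0]
example_idx_to_id = {0: 11, 1: 27, 2: 5, 3: 42}

for reciter_id, prob in top_k(example_logits, example_idx_to_id):
    print(f"reciter {reciter_id}: {prob:.1%}")
```

With the real model, `logits[0].tolist()` from the snippet above would take the place of `example_logits`.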

Pipeline overview:

```
Audio Input (mic / file / video)
        ↓
Whisper Encoder  ←─ runs once, shared
  ├── Whisper Decoder  → Arabic text
  └── MLP Classifier   → Reciter name
        ↓
RapidFuzz matching against 6,236 ayahs
        ↓
Surah name + Ayah number + confidence
```
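
The last stage of the pipeline matches the transcription against the Quran text with RapidFuzz. A minimal sketch of that idea follows. The normalization rules, the `match_ayah` helper, the record layout, and the three-ayah sample corpus are assumptions for illustration, since the layout of quran_simple.json is not described here. The code uses `rapidfuzz.fuzz.ratio` when the package is installed and falls back to the standard library's `difflib` otherwise.

```python
import re
from difflib import SequenceMatcher

try:
    from rapidfuzz import fuzz  # installed by the pip command in this card

    def similarity(a: str, b: str) -> float:
        return fuzz.ratio(a, b)  # similarity on a 0-100 scale
except ImportError:
    def similarity(a: str, b: str) -> float:
        # Stdlib fallback, scaled to the same 0-100 range
        return 100.0 * SequenceMatcher(None, a, b).ratio()

# Assumed normalization: drop diacritics and tatweel, unify alef variants
_DIACRITICS = re.compile("[\u064B-\u0652\u0670\u0640]")
_ALEF_FORMS = re.compile("[\u0622\u0623\u0625]")


def normalize(text: str) -> str:
    text = _DIACRITICS.sub("", text)
    text = _ALEF_FORMS.sub("\u0627", text)
    return " ".join(text.split())  # collapse repeated whitespace


def match_ayah(transcript: str, ayahs: list) -> dict:
    """Return surah, ayah and score of the record most similar to transcript."""
    query = normalize(transcript)
    best = None
    for record in ayahs:
        score = similarity(query, normalize(record["text"]))
        if best is None or score > best["score"]:
            best = {"surah": record["surah"], "ayah": record["ayah"], "score": score}
    return best


# Hypothetical record layout; the real file may be structured differently
AYAHS = [
    {"surah": 1, "ayah": 1, "text": "بسم الله الرحمن الرحيم"},
    {"surah": 1, "ayah": 2, "text": "الحمد لله رب العالمين"},
    {"surah": 1, "ayah": 3, "text": "الرحمن الرحيم"},
]

# Simulated transcription with one letter dropped
result = match_ayah("الحمد لله رب العلمين", AYAHS)
print(result["surah"], result["ayah"], round(result["score"], 1))
```

A linear scan keeps the sketch readable. For the full 6,236-ayah corpus, `rapidfuzz.process.extractOne` performs the same best-match search in a single call.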

Reciter classifier architecture:

```
Linear(1024→512) → BatchNorm → ReLU → Dropout(0.3)
→ Linear(512→256) → BatchNorm → ReLU → Dropout(0.2)
→ Linear(256→335)
```

Base model: openai/whisper-medium

See supported_reciters.txt for the full list of 335 supported reciters, including:
عبد الباسط عبد الصمد، محمد صديق المنشاوي، ياسر الدوسري، سعود الشريم، ماهر المعيقلي، عبدالرحمن السديس، and 329 more.

Citation:

```bibtex
@misc{suliman2026tadabur,
  author = {Suliman, Rakan and Mamdoh, Abdulrahman and Aldosari, Hussam and Khalid, Mohammed},
  title  = {Tadabur: Quran ASR with Surah/Ayah Identification and Reciter Recognition},
  year   = {2026},
  url    = {https://huggingface.co/rakansuliman/tadabur-whisper-medium}
}
```

CC BY-NC 4.0 — Research and educational use only. Please engage with Quran content respectfully. 🤲