EgypTalk-ASR-v2
NAMAA-Space/EgypTalk-ASR-v2 is a high-performance automatic speech recognition (ASR) model for Egyptian Arabic, trained using NVIDIA NeMo and optimized for real-world speech from native Egyptian speakers.
The model was trained on over 200 hours of high-quality, manually curated audio data collected and prepared by the NAMAA team. It is built on NVIDIA’s FastConformer Hybrid Large architecture and fine-tuned for Egyptian Arabic, enabling highly accurate transcription in casual, formal, and mixed-dialect settings.
Demo: Try it here
🗣️ Model Description
- Architecture: FastConformer Hybrid Large from NVIDIA NeMo ASR collection.
- Framework: PyTorch Lightning + NVIDIA NeMo.
- Languages: Egyptian Arabic (with some ability to handle Modern Standard Arabic).
- Dataset: 200+ hours of proprietary, high-quality audio for Egyptian Arabic, covering:
  - Spontaneous conversation
  - Broadcast media
  - Interviews
  - Read speech
- Tokenizer: SentencePiece (trained specifically for Egyptian Arabic phonetic coverage).
- Input Format: 16 kHz mono WAV files (see the conversion sketch after this list).
- Output: Raw transcribed text in Arabic.
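Since the model expects 16 kHz mono WAV input, arbitrary recordings usually need a conversion step first. Below is a minimal sketch, assuming the third-party librosa and soundfile packages are installed; the file names are placeholders.

```python
import librosa
import soundfile as sf

# Load any audio file, resampling to 16 kHz and downmixing to mono
audio, sr = librosa.load("input.mp3", sr=16000, mono=True)

# Write a 16 kHz mono WAV (soundfile defaults to 16-bit PCM for WAV)
sf.write("sample.wav", audio, 16000)
```

The resulting sample.wav can then be passed directly to model.transcribe() as shown in the Usage section.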
🚀 Key Features
- Egyptian Arabic Dialect Optimized – Designed to handle local pronunciations, colloquialisms, and speech patterns.
- High Accuracy – Achieves competitive word error rates (WER) on Egyptian Arabic test sets (see the WER check sketched after this list).
- FastConformer Efficiency – Low-latency, streaming-capable inference.
- Robust Dataset – Covers multiple domains (media, conversation, formal speech).
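If you have labeled Egyptian Arabic clips, the accuracy claim is easy to sanity-check locally. The sketch below uses the third-party jiwer package (an assumption; any WER implementation works), and the file path and reference transcript are placeholders.

```python
import jiwer
from nemo.collections.asr.models import ASRModel

model = ASRModel.from_pretrained("NAMAA-Space/EgypTalk-ASR-v2")

# Hypothetical labeled sample: a 16 kHz mono WAV plus its reference transcript
references = ["النص المرجعي للمقطع الصوتي"]
result = model.transcribe(["sample.wav"])

# Some NeMo versions return a (best, n-best) tuple for hybrid models;
# keep only the best hypotheses either way
predictions = result[0] if isinstance(result, tuple) else result

print("WER:", jiwer.wer(references, predictions))
```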
💻 Usage
```python
from nemo.collections.asr.models import ASRModel

# Load the model from the Hugging Face Hub
model = ASRModel.from_pretrained("NAMAA-Space/EgypTalk-ASR-v2")

# Transcribe a 16 kHz mono WAV file
transcription = model.transcribe(["sample.wav"])
print(transcription)
```
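For many files at once, batching and GPU placement speed things up considerably. A short sketch under the assumption that the listed file paths exist; batch_size is a standard keyword of NeMo's transcribe():

```python
import torch
from nemo.collections.asr.models import ASRModel

model = ASRModel.from_pretrained("NAMAA-Space/EgypTalk-ASR-v2")
if torch.cuda.is_available():
    model = model.cuda()  # GPU inference is much faster when available

# Hypothetical file list; each file should be a 16 kHz mono WAV
files = ["clip1.wav", "clip2.wav", "clip3.wav"]
results = model.transcribe(files, batch_size=8)
print(results)
```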
🛠️ Training Details
- Pretrained Base Model: nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0
- Training Framework: PyTorch Lightning (DDP strategy)
- Training Duration: 100 epochs, mixed precision enabled
- Optimizer: Adam with learning rate 1e-3
- Batch Size: 32 (train) / 8 (validation, test)
- Augmentations: silence trimming and start/end token usage
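For readers who want to reproduce a similar fine-tune on their own data, the listed hyperparameters map onto NeMo's standard fine-tuning flow roughly as follows. This is a hypothetical sketch, not the published training script: the manifest paths are placeholders, and NeMo expects JSON-lines manifests with audio_filepath, duration, and text fields.

```python
import pytorch_lightning as pl
from nemo.collections.asr.models import ASRModel

# DDP, 100 epochs, mixed precision, as listed above
trainer = pl.Trainer(
    devices=-1,
    accelerator="gpu",
    strategy="ddp",
    max_epochs=100,
    precision="16-mixed",
)

# Start from the same pretrained base model
model = ASRModel.from_pretrained("nvidia/stt_ar_fastconformer_hybrid_large_pcd_v1.0")
model.set_trainer(trainer)

# Hypothetical manifests; batch sizes match the listed 32 / 8 split
model.setup_training_data({
    "manifest_filepath": "train_manifest.json",
    "sample_rate": 16000,
    "batch_size": 32,
    "shuffle": True,
    "trim_silence": True,  # silence trimming, as listed
})
model.setup_validation_data({
    "manifest_filepath": "val_manifest.json",
    "sample_rate": 16000,
    "batch_size": 8,
    "shuffle": False,
})

# Adam with learning rate 1e-3
model.setup_optimization({"name": "adam", "lr": 1e-3})

trainer.fit(model)
```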
Citation
```bibtex
@misc{namaa2025egyptalk,
  title        = {NAMAA-Space/EgypTalk-ASR-v2},
  author       = {NAMAA},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/NAMAA-Space/EgypTalk-ASR-v2}},
  note         = {Accessed: 2025-03-02}
}
```