---
license: cc-by-4.0
thumbnail: null
widget:
  - example_title: Librispeech sample 1
    src: https://cdn-media.huggingface.co/speech_samples/sample1.flac
  - example_title: Librispeech sample 2
    src: https://cdn-media.huggingface.co/speech_samples/sample2.flac
metrics:
- wer
tags:
- automatic-speech-recognition
- speech
- audio
- Transducer
- TDT
- FastConformer
- Conformer
- pytorch
- NeMo
- hf-asr-leaderboard
language:
- en
pipeline_tag: automatic-speech-recognition
library_name: nemo
base_model:
- nvidia/parakeet-tdt-0.6b-v2
---
# Parakeet TDT 0.6B V2 - CoreML
This is a CoreML-optimized version of NVIDIA's Parakeet TDT 0.6B V2 model, designed for high-performance automatic speech recognition on Apple platforms.
## Model Description
This model has been converted to CoreML format for efficient on-device inference on Apple Silicon and iOS devices, enabling real-time speech recognition with a minimal memory footprint. The model will continue to evolve as we optimize performance and accuracy.
## Usage in Swift
See the [FluidAudio repository](https://github.com/FluidInference/FluidAudioSwift) for instructions.
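If you prefer to load the converted CoreML bundles directly rather than going through FluidAudio, a minimal sketch using Apple's Core ML API is shown below. The file name `ParakeetEncoder.mlmodelc` is a placeholder, not necessarily the name used in this repository; substitute whichever compiled model bundle(s) you download here.

```swift
import Foundation
import CoreML

do {
    // Prefer the Neural Engine / GPU when available.
    let config = MLModelConfiguration()
    config.computeUnits = .all

    // Placeholder path: replace with the actual .mlmodelc bundle from this repo.
    let modelURL = URL(fileURLWithPath: "ParakeetEncoder.mlmodelc")
    let model = try MLModel(contentsOf: modelURL, configuration: config)

    // Inspect the expected inputs and outputs before wiring up audio features.
    print(model.modelDescription.inputDescriptionsByName.keys)
    print(model.modelDescription.outputDescriptionsByName.keys)
} catch {
    print("Failed to load model: \(error)")
}
```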
## Performance
- Real-time factor: < 0.3x on M1 Pro (processing time / audio duration; lower is faster)
- Memory usage: ~800MB peak
- Supported platforms: macOS 14+, iOS 17+
- Optimized for: Apple Silicon
## Model Details
- Architecture: FastConformer-TDT
- Parameters: 0.6B
- Sample rate: 16 kHz
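
The model expects 16 kHz mono audio. As a rough sketch (not part of this repository, and with an illustrative helper name `loadAudioSamples`), the snippet below shows one way to resample an arbitrary audio file to that format with AVFoundation before running inference.

```swift
import AVFoundation

/// Reads an audio file and converts it to 16 kHz mono Float32 samples.
func loadAudioSamples(from url: URL) throws -> [Float] {
    let file = try AVAudioFile(forReading: url)
    guard let targetFormat = AVAudioFormat(commonFormat: .pcmFormatFloat32,
                                           sampleRate: 16_000,
                                           channels: 1,
                                           interleaved: false),
          let converter = AVAudioConverter(from: file.processingFormat, to: targetFormat),
          let sourceBuffer = AVAudioPCMBuffer(pcmFormat: file.processingFormat,
                                              frameCapacity: AVAudioFrameCount(file.length))
    else { return [] }

    try file.read(into: sourceBuffer)

    // Size the output buffer for the sample-rate ratio.
    let ratio = targetFormat.sampleRate / file.processingFormat.sampleRate
    let capacity = AVAudioFrameCount(Double(sourceBuffer.frameLength) * ratio) + 1
    guard let outputBuffer = AVAudioPCMBuffer(pcmFormat: targetFormat, frameCapacity: capacity)
    else { return [] }

    // Feed the whole source buffer once, then signal end of stream.
    var providedInput = false
    let status = converter.convert(to: outputBuffer, error: nil) { _, outStatus in
        if providedInput {
            outStatus.pointee = .endOfStream
            return nil
        }
        providedInput = true
        outStatus.pointee = .haveData
        return sourceBuffer
    }

    guard status != .error, let channelData = outputBuffer.floatChannelData else { return [] }
    return Array(UnsafeBufferPointer(start: channelData[0],
                                     count: Int(outputBuffer.frameLength)))
}
```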
## License
This model is released under the CC-BY-4.0 license. See the LICENSE file for details.
## Acknowledgments
Based on NVIDIA's Parakeet TDT model. CoreML conversion and Swift integration by the FluidInference team.