	---
	license: other
	license_name: stability-ai-community
	license_link: https://huggingface.co/stabilityai/stable-audio-open-small/blob/main/LICENSE
	extra_gated_heading: "Access to Stability AI Models Required"
	extra_gated_description: "This model requires access to Stability AI's original model. Please request access to stable-audio-open-small first, then come back here."
	extra_gated_button_content: "I have Stability AI access"
	extra_gated_prompt: "By requesting access, you confirm that you have been granted access to Stability AI's stable-audio-open-small model on Hugging Face."
	extra_gated_fields:
	  Stability AI Username: text
	  I confirm I have access to stable-audio-open-small: checkbox
	---

	# Stable Audio Open Small - TensorFlow Lite + Runner

	Complete package for running Stability AI's `stable-audio-open-small` model on edge devices.

	Powered by Stability AI

	## Contents

	- `conditioners_float32.tflite` - T5 text encoder model
	- `dit_model.tflite` - Diffusion transformer model
	- `autoencoder_model.tflite` - Audio decoder model
	- `spiece.model` - SentencePiece tokenizer model
	- `audiogen.exe` - Inference runner binary
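
Before a first run it can help to confirm that all five files listed above are in one directory. The following is a minimal sketch in Python, assuming the runner expects the models and tokenizer side by side in the models path; the helper name `missing_files` is hypothetical and not part of this package.

```python
from pathlib import Path

# File names exactly as listed in the Contents section of this README.
REQUIRED_FILES = (
    "conditioners_float32.tflite",
    "dit_model.tflite",
    "autoencoder_model.tflite",
    "spiece.model",
    "audiogen.exe",
)


def missing_files(models_dir):
    """Return the required file names that are absent from models_dir."""
    root = Path(models_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    absent = missing_files(".")
    if absent:
        print("Missing files:", ", ".join(absent))
    else:
        print("All package files are present.")
```

An empty list means every file was found; any names returned are the ones still to download.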

	## Quick Start

	```bash
	./audiogen.exe ./ "ambient forest sounds" 4 42
	```

	Arguments: `<models_path> <prompt> <num_threads> <seed>`
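
The four positional arguments map onto a small wrapper when the runner is driven from a script. The sketch below builds the argument list in the order shown above and launches it with Python's `subprocess` module. The function names `build_command` and `generate` are hypothetical, and the sketch assumes the runner exits with a non-zero status on failure, which this README does not confirm.

```python
import subprocess


def build_command(runner, models_path, prompt, num_threads, seed):
    """Build the argv list in the order <models_path> <prompt> <num_threads> <seed>."""
    return [str(runner), str(models_path), prompt, str(num_threads), str(seed)]


def generate(prompt, seed, models_path="./", num_threads=4, runner="./audiogen.exe"):
    """Run the inference runner once and return the CompletedProcess."""
    cmd = build_command(runner, models_path, prompt, num_threads, seed)
    # Passing a list instead of a shell string keeps prompts that contain
    # spaces or quotes intact as a single argument.
    # check=True raises CalledProcessError on a non-zero exit status
    # (assumed, not documented, to signal failure).
    return subprocess.run(cmd, check=True)
```

Calling `generate("ambient forest sounds", seed=42)` reproduces the Quick Start command.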

	The runner writes an `output.wav` file containing 10 seconds of stereo audio at 44.1 kHz.
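
The output properties stated above can be checked after a run. This sketch reads the WAV header with Python's standard `wave` module and compares it with those values. It assumes the runner writes integer PCM samples; the `wave` module may reject other encodings such as 32-bit float. The function names and the half-second tolerance are hypothetical choices.

```python
import wave


def wav_summary(path):
    """Read channel count, sample rate, and duration from a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        frames = wav.getnframes()
    return {
        "channels": channels,
        "sample_rate": sample_rate,
        "duration_s": frames / sample_rate,
    }


def matches_readme(summary, tolerance_s=0.5):
    """Compare a summary with the stated values: stereo, 44.1 kHz, 10 seconds."""
    return (
        summary["channels"] == 2
        and summary["sample_rate"] == 44100
        and abs(summary["duration_s"] - 10.0) <= tolerance_s
    )
```

For example, `matches_readme(wav_summary("output.wav"))` returns `True` when the file has the expected layout.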

	## Model Description

	This is a converted version of the original [stable-audio-open-small](https://huggingface.co/stabilityai/stable-audio-open-small) model by Stability AI, split into three optimized TensorFlow Lite models for mobile and edge deployment. The package also includes a ready-to-use C++ inference runner.

	Architecture:
	- Text Encoder: T5-based conditioner for prompt processing
	- Diffusion Model: DiT (Diffusion Transformer) for latent generation
	- Audio Decoder: Autoencoder for converting latents to audio
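
This README does not document the tensor names or shapes that connect the three stages, so inspecting each model is the safest way to learn them. The sketch below summarises the inputs and outputs of a TensorFlow Lite interpreter. It assumes the `tensorflow` Python package is installed and that the models load in the stock interpreter without custom operators, neither of which is confirmed here. `describe_io` and `load_interpreter` are hypothetical helper names.

```python
def describe_io(interpreter):
    """Summarise the input and output tensors of a TFLite interpreter."""

    def summarise(details):
        # Each detail is a dict that includes "name", "shape", and "dtype".
        return [
            (
                d["name"],
                tuple(int(n) for n in d["shape"]),
                getattr(d["dtype"], "__name__", str(d["dtype"])),
            )
            for d in details
        ]

    return {
        "inputs": summarise(interpreter.get_input_details()),
        "outputs": summarise(interpreter.get_output_details()),
    }


def load_interpreter(model_path):
    """Load a .tflite file; assumes the tensorflow package is available."""
    # Imported lazily so describe_io can be used without TensorFlow installed.
    import tensorflow as tf

    return tf.lite.Interpreter(model_path=str(model_path))
```

Running `describe_io(load_interpreter("dit_model.tflite"))`, and the same for the other two `.tflite` files, shows which outputs of one stage line up with the inputs of the next.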

	## Components & Licenses

	| Component | License | Copyright |
	|-----------|---------|-----------|
	| AI Models (`*.tflite`) | Stability AI Community License | Stability AI Ltd. |
	| Tokenizer (`spiece.model`) | Apache 2.0 | Google |
	| Runner Binary (`audiogen.exe`) | Apache 2.0 | ARM Limited |

	## Usage Rights

	- ✅ Research & Non-Commercial: Free to use
	- ✅ Commercial Use: Free for organizations with less than $1M in annual revenue (registration required at [stability.ai](https://stability.ai/community-license))
	- ⚠️ Commercial Use: The community license terminates once annual revenue exceeds $1M; an enterprise license is then required

	## License

	This repository is primarily governed by the Stability AI Community License Agreement due to the AI models.

	The SentencePiece tokenizer and the ARM inference code are licensed under Apache 2.0, but the package as a whole follows the most restrictive license, the Stability AI Community License.

	See the complete license terms: [Stability AI Community License](https://huggingface.co/stabilityai/stable-audio-open-small/blob/main/LICENSE)

	## Attribution

	This Stability AI Model is licensed under the Stability AI Community License, Copyright © Stability AI Ltd. All Rights Reserved.

	- Original Model: [stable-audio-open-small](https://huggingface.co/stabilityai/stable-audio-open-small) by Stability AI
	- Conversion: TensorFlow Lite conversion by InnerMost47
	- Runner: Based on ARM's TensorFlow Lite inference code

	## Disclaimer

	This package is provided "AS IS" without warranties. The TensorFlow Lite conversion may affect model performance. Please refer to the original model documentation for detailed specifications.