---
license: apache-2.0
base_model:
- HuggingFaceTB/SmolLM3-3B
pipeline_tag: text-generation
library_name: optimum-executorch
tags:
- executorch
- transformers
- optimum-executorch
- smollm
---
# Run on-device with ExecuTorch

This model has been optimized and exported to the ExecuTorch format so it can run on edge devices.

Once ExecuTorch is set up, you can download the `*.pte` program and the tokenizer file directly and run the model in a mobile app (see Running in a mobile app).
## Export to ExecuTorch

First, install the required packages:

```shell
pip install git+https://github.com/huggingface/optimum-executorch@main
```
Then update the dependencies to their latest versions so that the export works with SmolLM3-3B:

```shell
python install_dev.py
```
Use `optimum-cli` to export the model to ExecuTorch:

```shell
optimum-cli export executorch \
  --model HuggingFaceTB/SmolLM3-3B \
  --task text-generation \
  --recipe xnnpack \
  --use_custom_sdpa \
  --use_custom_kv_cache \
  --qlinear \
  --qembedding \
  --output_dir ./smollm3_3b
```
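As a quick sanity check before deploying to a device, the exported program can be loaded and run from Python with the `optimum-executorch` API. This is a sketch under current `optimum-executorch` conventions; the `text_generation` helper and its parameters may change between releases, and the prompt string is just an illustrative placeholder:

```python
from transformers import AutoTokenizer
from optimum.executorch import ExecuTorchModelForCausalLM

# Load the exported .pte program from the export output directory above
model = ExecuTorchModelForCausalLM.from_pretrained("./smollm3_3b")

# The tokenizer is unchanged from the original checkpoint
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM3-3B")

# Run generation on the host CPU (the XNNPACK backend used in the recipe)
output = model.text_generation(
    tokenizer=tokenizer,
    prompt="Give me a short introduction to large language models.",
    max_seq_len=128,
)
print(output)
```

Running the same `.pte` on-device uses the ExecuTorch runtime directly; this Python path is only a convenient way to verify the export on the host.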
## Disclaimer
PyTorch has not performed safety evaluations or red teamed the quantized models. Performance characteristics, outputs, and behaviors may differ from the original models. Users are solely responsible for selecting appropriate use cases, evaluating and mitigating for accuracy, safety, and fairness, ensuring security, and complying with all applicable laws and regulations.
Nothing contained in this Model Card should be interpreted as or deemed a restriction or modification to the licenses the models are released under, including any limitations of liability or disclaimers of warranties provided therein.