Qwen3-4B is a 4-billion-parameter general-purpose language model from the Qwen team at Alibaba Cloud.
Part of the Qwen3 series, it balances strong language understanding, reasoning, and generation performance with efficient deployment at smaller scale.
Trained on a large, high-quality multilingual dataset, Qwen3-4B supports a broad range of NLP tasks and can be fine-tuned for specialized domains.
Input:
Output:
⚠️ Hardware requirement: the model currently runs only on Qualcomm NPUs (e.g., Snapdragon-powered AI PCs).
Apple NPU support is planned next.
Create a token in the Model Hub, then log in:
nexa config set license '<access_token>'
Run the model:
nexa infer NexaAI/qwen3-4B-npu
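
The two commands above can be wrapped in a small shell function that fails early when the CLI or the token is missing. This is only a sketch: `run_qwen3_npu` is a hypothetical helper name, and `NEXA_TOKEN` is an assumed environment variable chosen for illustration, not something the nexa CLI reads on its own. The only nexa invocations used are the two shown above.

```shell
# Sketch: set the license, then start inference with the model.
# NEXA_TOKEN is a hypothetical variable holding the Model Hub access token.
run_qwen3_npu() {
    # Stop if the nexa CLI is not installed or not on PATH.
    if ! command -v nexa >/dev/null 2>&1; then
        echo "nexa CLI not found on PATH" >&2
        return 127
    fi
    # Stop if no token was provided.
    if [ -z "${NEXA_TOKEN:-}" ]; then
        echo "NEXA_TOKEN is not set" >&2
        return 2
    fi
    # Same commands as above; skip inference if setting the license fails.
    nexa config set license "$NEXA_TOKEN" || return
    nexa infer NexaAI/qwen3-4B-npu
}
```

After defining the function, export the token and call it: `export NEXA_TOKEN='<access_token>'` followed by `run_qwen3_npu`.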