Nexa AI

Team

company

https://nexa.ai/

nexa_ai

NexaAI

AI & ML interests

On Device AI Deployment and Research

Recent Activity

nexaml updated a model 8 days ago

NexaAI/functiongemma-270m-it-GGUF

nexaml published a model 8 days ago

NexaAI/functiongemma-270m-it-GGUF

nexaml updated a model 9 days ago

NexaAI/Qwen3-VL-2B-Instruct-GGUF

Papers

AutoNeural: Co-Designing Vision-Language Models for NPU Inference

NexaAI's collections (12)

Qualcomm NPU

Latest SOTA models supported on Qualcomm NPU.

NexaAI/AutoNeural

Image-Text-to-Text • Updated 23 days ago • 58 • 12
NexaAI/Ministral-3-3B-npu

Updated 18 days ago • 49
NexaAI/OmniNeural-4B

Any-to-Any • Updated Nov 7 • 95 • 160
NexaAI/rf-detr-seg-preview-npu

Object Detection • Updated 25 days ago • 26

Qualcomm NPU IoT

Multimodal models running on Qualcomm NPU for Qualcomm IQ9 and RB3

NexaAI/convnext-tiny-npu-IoT

Updated Nov 11 • 9
NexaAI/Granite-4.0-h-350M-NPU-IoT

Updated 24 days ago • 21
NexaAI/convnext-tiny-npu-IoT-rb3

Updated 24 days ago • 22

Qwen3VL

Nexa AI infrastructure for running Qwen3VL on GPU, NPU, and CPU

NexaAI/Qwen3-VL-4B-Instruct-GGUF

Image-Text-to-Text • 4B • Updated 12 days ago • 2.18k • 29
NexaAI/Qwen3-VL-4B-Thinking-GGUF

Image-Text-to-Text • 4B • Updated Oct 27 • 327 • 6
NexaAI/Qwen3-VL-8B-Instruct-GGUF

Image-Text-to-Text • Updated 10 days ago • 898 • 21
NexaAI/Qwen3-VL-8B-Thinking-GGUF

Image-Text-to-Text • 8B • Updated Oct 27 • 323 • 12

Multimodal - MLX

Language models that take vision and/or audio input, hand-picked by the Nexa team.

NexaAI/gemma-3n-E4B-it-4bit-MLX

Image-Text-to-Text • Updated Jul 22 • 34 • 2
NexaAI/Qwen2.5-VL-7B-Instruct-4bit-MLX

Image-Text-to-Text • 2B • Updated Jul 22 • 38
NexaAI/SmolVLM-500M-Instruct-8bit-MLX

Image-Text-to-Text • 0.7B • Updated Jul 22 • 19
NexaAI/SmolVLM-Instruct-8bit-MLX

Image-Text-to-Text • 0.7B • Updated Jul 22 • 12

Multimodal - GGUF

Language models that take vision and/or audio input, hand-picked by the Nexa team.

NexaAI/qwen2.5vl

Image-Text-to-Text • 8B • Updated Jul 22 • 97 • 1
NexaAI/Qwen2.5-Omni-3B-GGUF

Any-to-Any • 3B • Updated Nov 19 • 1.58k • 4

NexaQuant Models

NexaQuant compresses models with 100% accuracy recovery.

NexaAI/DeepSeek-R1-Distill-Llama-8B-NexaQuant

8B • Updated Feb 19 • 70 • 90
NexaAI/DeepSeek-R1-Distill-Qwen-1.5B-NexaQuant

2B • Updated Feb 19 • 153 • 93

Apple Neural Engine

Latest SOTA models supported on Apple Neural Engine

NexaAI/Ministral-3-3B-ANE

Updated 23 days ago • 23 • 1
NexaAI/Gemma3-1B-ANE

Updated 9 days ago • 64 • 1
NexaAI/Qwen3-0.6B-ANE

Updated Nov 18 • 56 • 1
NexaAI/Granite-4-Micro-ANE

Text Generation • Updated Nov 18 • 18 • 1

Qualcomm NPU Mobile

Multimodal models running on Qualcomm NPU for Snapdragon 8 Gen 4

NexaAI/OmniNeural-4B-mobile

Any-to-Any • Updated Nov 15 • 31 • 2
NexaAI/Granite-4.0-h-350M-NPU-mobile

Updated 18 days ago • 1
NexaAI/Granite-4-Micro-NPU-mobile

Text Generation • Updated Nov 15 • 1
NexaAI/embedneural-npu-mobile

Updated Nov 18 • 1

Intel NPU

Latest SOTA models supported on Intel NPU

NexaAI/llama3.2-3B-intel-npu

Updated 16 days ago • 44
NexaAI/llama3.2-1B-intel-npu

Updated 16 days ago • 76
NexaAI/deepSeek-r1-distill-qwen-1.5B-intel-npu

Updated 16 days ago • 87
NexaAI/deepSeek-r1-distill-qwen-7B-intel-npu

Updated 16 days ago • 43

LLM - MLX

Text generation models in MLX format, hand-picked by the Nexa team.

NexaAI/Qwen3-4B-4bit-MLX

Text Generation • 0.6B • Updated Jul 22 • 15
NexaAI/Qwen3-1.7B-4bit-MLX

Text Generation • 0.3B • Updated Jul 22 • 17 • 1
NexaAI/Qwen3-0.6B-bf16-MLX

Text Generation • 0.6B • Updated Jul 22 • 9
NexaAI/Qwen3-0.6B-8bit-MLX

Text Generation • 0.2B • Updated Jul 22 • 12

LLM - GGUF

Text generation models in GGUF format, hand-picked by the Nexa team.

NexaAI/gpt-oss-20b-GGUF

Text Generation • 21B • Updated Aug 7 • 21 • 1
NexaAI/Qwen3-4B-GGUF

Text Generation • 4B • Updated Aug 9 • 2.58k • 1
NexaAI/Qwen3-0.6B-GGUF

Text Generation • 0.6B • Updated Aug 9 • 716 • 1

Nexa Models

Tiny, multimodal on-device models developed by Nexa AI.

NexaAI/Octopus-v2

Text Generation • 3B • Updated May 21, 2024 • 544 • 891
NexaAI/OmniVLM-968M

0.5B • Updated Aug 20 • 2.93k • 528
NexaAI/octo-net

Text Generation • 4B • Updated May 5, 2024 • 71 • 144
NexaAI/octo-planner-2b

Text Generation • 3B • Updated Jun 27, 2024 • 22 • 10
