---
license: cc
tags:
- multimodal
---

# **OmniNeural** — World’s First Multimodal Model Designed for NPU

## **Overview**
**OmniNeural** is the first multimodal model designed specifically for Neural Processing Units (NPUs). It natively understands **text, images, and audio**, and runs across PCs, mobile devices, vehicles, IoT, and robotics.

By co-designing the software and model architecture with NPU hardware, OmniNeural delivers:

- **2–4× better efficiency than CPU and 4–8× better than GPU** in battery usage.
- **Smooth multitasking**, running large generative AI models without slowing other applications.

This combination of speed, efficiency, and NPU support makes OmniNeural the most practical multimodal foundation for edge intelligence.
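Read as energy per inference, the efficiency multiples above bound how much longer a fixed battery budget lasts. A quick sanity check with hypothetical numbers (the joule figures and battery size are illustrative assumptions, not measured values; only the ratios come from the claim):

```python
# Hypothetical energy per inference in joules; only the ratios matter.
gpu_j = 8.0
npu_j = gpu_j / 8          # upper end of the claimed 4-8x advantage over GPU
battery_j = 40_000         # ~11 Wh battery, an assumed device-class figure

print(int(battery_j / gpu_j))   # inferences per charge on GPU: 5000
print(int(battery_j / npu_j))   # inferences per charge on NPU: 40000
```

At the claimed upper bound, the same battery sustains 8× as many inferences; at the lower bound (4×), half that.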

---

- **Hardware-Aware Attention** – Attention patterns tuned for NPUs, lowering compute and memory demand.
- **Native Static Graph** – Supports variable-length multimodal inputs with stable, predictable latency.
- **Performance Gains** – **9× faster audio processing** and **3.5× faster image processing** on NPUs compared to baseline encoders.
- **Privacy-First Inference** – All computation stays local: private, offline-capable, and cost-efficient.
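A static graph means the NPU sees only fixed tensor shapes at compile time, so variable-length inputs have to be reconciled with precompiled shapes somehow. One common technique is shape bucketing: pad each input up to the nearest precompiled length. A minimal sketch of that idea (the bucket sizes and pad value are illustrative assumptions, not OmniNeural's actual configuration):

```python
import bisect

# Illustrative bucket lengths; a real deployment would pick these from
# profiling typical input lengths (assumption, not OmniNeural's config).
BUCKETS = [64, 128, 256, 512]

def pad_to_bucket(tokens, pad_id=0):
    """Pad a variable-length token list to the nearest bucket length,
    so the compiled graph only ever sees a few static shapes."""
    n = len(tokens)
    i = bisect.bisect_left(BUCKETS, n)
    if i == len(BUCKETS):
        raise ValueError(f"input length {n} exceeds largest bucket")
    return tokens + [pad_id] * (BUCKETS[i] - n)

print(len(pad_to_bucket(list(range(100)))))  # padded to 128
```

Because every input maps to one of a handful of shapes, each shape's graph can be compiled once and reused, which is what makes the latency stable and predictable.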

---

- **Text**: Matches or outperforms leading multimodal baselines.

### Nexa Attention Speedups
- **9× faster** audio encoding (vs Whisper).
- **3.5× faster** image encoding (vs SigLIP).

---
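The per-encoder speedups above translate into a smaller end-to-end gain, since decoding and other stages are unaffected; the overall factor follows Amdahl's law. A quick check under hypothetical time fractions (the 30/20/50 split is an assumption for illustration, not a measured profile):

```python
def end_to_end_speedup(fractions_speedups):
    """Amdahl's law: each (fraction, speedup) pair gives the share of
    baseline runtime and the factor applied to it (1.0 = unchanged)."""
    new_time = sum(f / s for f, s in fractions_speedups)
    return 1.0 / new_time

# Assumed split: 30% audio encode (9x), 20% image encode (3.5x), 50% decode.
overall = end_to_end_speedup([(0.30, 9.0), (0.20, 3.5), (0.50, 1.0)])
print(round(overall, 2))  # ~1.69x end to end
```

The takeaway: even large encoder speedups yield modest pipeline-level gains when decoding dominates, so the headline figures are best read as component benchmarks.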