nexa4ai committed on
Commit f1daf2f · verified · 1 Parent(s): c97477b

Update README.md

Files changed (1)
  1. README.md +5 -12
README.md CHANGED
@@ -3,17 +3,10 @@ license: cc
  tags:
  - multimodal
  ---
- # **OmniNeural** — World’s First Multimodal Model Designed for NPU
+ # **OmniNeural** — World’s First NPU-aware Multimodal Model
 
  ## **Overview**
- **OmniNeural** is the first multimodal model designed specifically for Neural Processing Units (NPUs). It natively understands **text, images, and audio**, and runs across PCs, mobile devices, vehicles, IoT, and robotics.
-
- By co-designing the software and model architecture with NPU hardware, OmniNeural achieves:
- - **Up to 1.5× faster than CPU and 4× faster than GPU** for inference on consumer devices (e.g., Samsung S25 Ultra) .
- - **2–4× better efficiency than CPU and 4–8× better than GPU** in battery usage .
- - **Smooth multitasking**, running large generative AI models without slowing other applications .
-
- This combination of speed, efficiency, and NPU support makes OmniNeural the most practical multimodal foundation for edge intelligence.
+ **OmniNeural** is the first fully multimodal model designed specifically for Neural Processing Units (NPUs). It natively understands **text, images, and audio**, and runs across PCs, mobile devices, automobile, IoT, and robotics.
 
  ---
 
@@ -46,15 +39,15 @@ This combination of speed, efficiency, and NPU support makes OmniNeural the most
  ## **Performance / Benchmarks**
  ### Human Evaluation (vs baselines)
  - **Vision**: Wins/ties in ~75% of prompts against Apple Foundation, Gemma-3n-E4B, Qwen2.5-Omni-3B.
- - **Audio**: Clear lead over baselines, especially in Whisper-encoder style tasks.
+ - **Audio**: Clear lead over baselines, much better than Gemma3n and Apple foundation model.
  - **Text**: Matches or outperforms leading multimodal baselines.
 
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/imf2Q9H9EgTzWklg2Wm4_.png)
 
  ### Nexa Attention Speedups
- - **9× faster** audio encoding (vs Whisper).
- - **3.5× faster** image encoding (vs SigLIP).
+ - **9× faster** audio encoding (vs Whisper encoder).
+ - **3.5× faster** image encoding (vs SigLIP encoder).
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6618e0424dbef6bd3c72f89a/EqdcVCRnb6jpK_ckFyo5z.png)
 