Qwen2.5-Omni-3B
This version of Qwen2.5-Omni-3B has been converted to run on the Axera NPU using w8a16 quantization.
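As a rough illustration of what w8a16 means (8-bit integer weights, 16-bit activations at runtime), here is a minimal, self-contained sketch of the weight half: per-row symmetric int8 quantization with a float scale. The function names are hypothetical and this is not the Pulsar2 implementation, just the general technique.

```python
def quantize_w8(weights):
    """Per-row symmetric int8 quantization: the 'w8' half of w8a16.

    Each row is mapped to int8 values in [-128, 127] plus one float
    scale; activations would stay in 16-bit floats at inference time.
    Illustrative sketch only, not the toolchain's actual scheme.
    """
    out = []
    for row in weights:
        # Scale so the largest magnitude maps to 127; guard against all-zero rows.
        scale = max(abs(v) for v in row) / 127.0 or 1.0
        q = [max(-128, min(127, round(v / scale))) for v in row]
        out.append((q, scale))
    return out


def dequantize_w8(qrows):
    """Recover approximate float weights from (int8 row, scale) pairs."""
    return [[q * s for q in row] for row, s in qrows]
```

The scale is chosen per row rather than per tensor, which keeps quantization error small when rows have very different magnitudes.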
Compatible with Pulsar2 version: 3.4
Conversion tool links:
If you are interested in model conversion, you can try exporting the axmodel from the original repo: https://huggingface.co/Qwen/Qwen2.5-Omni-3B
Pulsar2 documentation: How to Convert an LLM from Huggingface to axmodel
Supported Platforms
- AX650
- AX650N DEMO Board
- M4N-Dock(爱芯派Pro)
- M.2 Accelerator card
Audio Process: to be added
Image Process: to be added
Video Process: to be added
The DDR capacity refers to the CMM memory the model consumes. Make sure the CMM memory allocated on the development board is larger than this value.
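One quick way to get a lower bound on the CMM the model will need is to sum the sizes of the `.axmodel` files you downloaded. The helper below is a hypothetical sketch (not part of this repo); actual CMM usage also includes runtime buffers, so treat the result as a minimum, not an exact figure.

```python
import os


def estimate_cmm_bytes(model_dir):
    """Rough lower bound on CMM: total size of all .axmodel files.

    Walks `model_dir` recursively and sums the on-disk size of every
    file ending in `.axmodel`. Runtime KV-cache and I/O buffers are
    NOT included, so the real CMM requirement will be higher.
    """
    total = 0
    for root, _dirs, files in os.walk(model_dir):
        for name in files:
            if name.endswith(".axmodel"):
                total += os.path.getsize(os.path.join(root, name))
    return total
```

Compare the returned value against the board's configured CMM pool before running the demo.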
How to use
Download all files from this repository to the device
If you are using an AX650 board
(base) lihongjie@ai-dev1:~/axera-tech/Qwen2.5-Omni-3B$ tree -L 2
.
├── python
│ ├── modeling_axinfer.py
│ ├── preprocess.py
│ ├── run_axinfer.py
│ ├── utils_axinfer.py
│ ├── utils_lm.py
│ ├── utils.py
│ └── utils_vl.py
├── Qwen2.5-Omni-3B-AX650-talker-chunk_prefill_512
│ ├── config.json
│ ├── qwen2_5_omni_talker_p128_l0_together.axmodel
......
│ ├── qwen2_5_omni_talker_post.axmodel
│ ├── special_tokens_map.json
│ ├── spk_dict.pt
│ ├── thinker_to_talker_proj_decode.axmodel
│ ├── thinker_to_talker_proj_prefill_512.axmodel
│ ├── token2wav_bigvgan.axmodel
│ ├── token2wav_dit.axmodel
│ ├── tokenizer_config.json
│ ├── tokenizer.json
│ └── vocab.json
├── Qwen2.5-Omni-3B-AX650-thinker-chunk_prefill_512
│ ├── config.json
......
│ ├── Qwen2.5-Omni-3B_vision.axmodel
│ ├── qwen2_5_omni_text_p128_l0_together.axmodel
......
│ ├── qwen2_5_omni_text_post.axmodel
├── README.md
└── videos
Demo
Run command
cd python
python3 run_axinfer.py
output
text
It's a Nord Electro 6 keyboard. It's a really popular one. It has a lot of features like different sounds and effects. Have you played with it much?
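The model directory names encode the scheduling scheme the graphs were exported with (`chunk_prefill_512`, `p128`): the prompt is prefilled in fixed-size chunks within a bounded window, and anything beyond that window is handled token by token by the decode graph. The sketch below illustrates that splitting logic under those assumptions; the function and its behavior are hypothetical, not code from this repo.

```python
def chunk_prefill(token_ids, chunk_len=128, max_prefill=512):
    """Sketch of chunked-prefill scheduling (illustrative only).

    Tokens inside the `max_prefill` window are grouped into chunks of
    `chunk_len` for the prefill graph; any overflow is returned
    separately for one-token-at-a-time decoding.
    """
    prefill_part = token_ids[:max_prefill]
    chunks = [
        prefill_part[i:i + chunk_len]
        for i in range(0, len(prefill_part), chunk_len)
    ]
    decode_part = token_ids[max_prefill:]
    return chunks, decode_part
```

Fixed chunk sizes let the NPU graphs be compiled once with static shapes, which is why the exported `.axmodel` filenames carry the chunk length.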
Model tree for AXERA-TECH/Qwen2.5-Omni-3B
Base model: Qwen/Qwen2.5-Omni-3B