metadata
license: mit
language:
  - en
  - zh
base_model:
  - OFA-Sys/chinese-clip-vit-large-patch14-336px
  - AXERA-TECH/cnclip
tags:
  - CLIP
  - CN_CLIP
pipeline_tag: zero-shot-image-classification

LibCLIP

This SDK enables efficient text-to-image retrieval using CLIP (Contrastive Language–Image Pretraining). It is optimized for Axera's NPU-based SoC platforms (AX650, AX650C, AX8850, and AX650A) and for Axera's dedicated AI accelerator.

With this SDK, you can:

  • Perform semantic image search by providing natural language queries.
  • Utilize CLIP to embed text queries and compare them against a pre-computed set of image embeddings.
  • Run all inference processes directly on Axera NPUs for low-latency, high-throughput performance at the edge.
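The retrieval step described above (embed a text query, then compare it against pre-computed image embeddings) can be sketched in a few lines. This is a minimal illustration, not the SDK's implementation: it assumes cosine similarity as the comparison metric, and the function names `l2_normalize` and `top_k`, the file names, and the toy 3-dimensional vectors are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def top_k(query_feat, image_db, k=3):
    """Rank stored image features by cosine similarity to the query feature.

    image_db maps an image name to its (unnormalized) feature vector.
    Returns a list of (name, score) pairs, best match first.
    """
    q = l2_normalize(query_feat)
    scored = []
    for name, feat in image_db.items():
        v = l2_normalize(feat)
        score = sum(a * b for a, b in zip(q, v))
        scored.append((name, score))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy data: hypothetical 3-dimensional features standing in for real embeddings.
image_db = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.1, 0.9, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]  # stands in for the text encoder output of a query

results = top_k(query, image_db, k=2)
for name, score in results:
    print(f"{name}: {score:.4f}")
```

Normalizing both sides first matters when the encoders return unnormalized features; without it, vectors with larger magnitude would dominate the ranking.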

This solution is well-suited for smart cameras, content filtering, AI-powered user interfaces, and other edge AI scenarios where natural language-based image retrieval is required.

Reference links:

For those who are interested in model conversion, you can try to export axmodel through

Supported Platforms

Performance

| Model | Input Shape | Latency (ms) | CMM Usage (MB) |
| --- | --- | --- | --- |
| cnclip_vit_l14_336px_vision_u16u8.axmodel | 1 x 3 x 336 x 336 | 88.475 | 304 |
| cnclip_vit_l14_336px_text_u16.axmodel | 1 x 52 | 4.576 | 122 |
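To put these figures in context, the short calculation below converts the latencies into rough upper bounds on throughput. It is a back-of-the-envelope sketch: it assumes inferences run back to back one at a time, ignores image decoding, preprocessing, and data transfer, and assumes both models are resident in CMM at once. Real end-to-end rates will be lower.

```python
# Figures copied from the performance table.
vision_latency_ms = 88.475
text_latency_ms = 4.576
vision_cmm_mb = 304
text_cmm_mb = 122


def per_second(latency_ms):
    """Upper bound on inferences per second for a given single-inference latency."""
    return 1000.0 / latency_ms


vision_rate = per_second(vision_latency_ms)
text_rate = per_second(text_latency_ms)

# Assumption: both encoders are loaded at the same time.
total_cmm_mb = vision_cmm_mb + text_cmm_mb

# Pure NPU time to embed a 1000-image gallery, excluding all other overhead.
gallery_size = 1000
index_seconds = gallery_size * vision_latency_ms / 1000.0

print(f"vision encoder: up to {vision_rate:.1f} images/s")
print(f"text encoder:   up to {text_rate:.1f} queries/s")
print(f"combined CMM:   {total_cmm_mb} MB")
print(f"indexing {gallery_size} images: at least {index_seconds:.1f} s of NPU time")
```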

How to use

Download all files from this repository to the device

(base) axera@raspberrypi:~/samples/AXERA-TECH/libclip.axera $ tree -L 2
.
├── cnclip
│   ├── cnclip_vit_l14_336px_text_u16.axmodel
│   ├── cnclip_vit_l14_336px_vision_u16u8.axmodel
│   └── cn_vocab.txt
├── coco_1000.tar
├── config.json
├── gradio_01.png
├── install
│   ├── examples
│   ├── include
│   └── lib
├── pyclip
│   ├── example.py
│   ├── gradio_example.png
│   ├── gradio_example.py
│   ├── libclip.so
│   ├── __pycache__
│   ├── pyclip.py
│   └── requirements.txt
└── README.md

8 directories, 13 files

Python environment requirements

pip install -r pyclip/requirements.txt

Inference on an AX650 host, such as the M4N-Dock (爱芯派Pro, AXera-Pi Pro)

TODO

Inference with M.2 Accelerator card

What is the M.2 accelerator card? This demo runs on a Raspberry Pi 5.

(py312) axera@raspberrypi:~/samples/AXERA-TECH/libclip.axera $ export LD_PRELOAD=/usr/lib/aarch64-linux-gnu/libstdc++.so.6
(py312) axera@raspberrypi:~/samples/AXERA-TECH/libclip.axera $ cp install/lib/aarch64/libclip.so pyclip/
(py312) axera@raspberrypi:~/samples/AXERA-TECH/libclip.axera $ tar xf coco_1000.tar
(py312) axera@raspberrypi:~/samples/AXERA-TECH/libclip.axera $ python pyclip/gradio_example.py --ienc cnclip/cnclip_vit_l14_336px_vision_u16u8.axmodel --tenc cnclip/cnclip_vit_l14_336px_text_u16.axmodel --vocab cnclip/cn_vocab.txt --isCN 1 --db_path clip_feat_db_coco --image_folder coco_1000/
Trying to load: /home/axera/samples/AXERA-TECH/libclip.axera/pyclip/aarch64/libclip.so

❌ Failed to load: /home/axera/samples/AXERA-TECH/libclip.axera/pyclip/aarch64/libclip.so
   /home/axera/samples/AXERA-TECH/libclip.axera/pyclip/aarch64/libclip.so: cannot open shared object file: No such file or directory
🔍 File not found. Please verify that libclip.so exists and the path is correct.

Trying to load: /home/axera/samples/AXERA-TECH/libclip.axera/pyclip/libclip.so
open libax_sys.so failed
open libax_engine.so failed
✅ Successfully loaded: /home/axera/samples/AXERA-TECH/libclip.axera/pyclip/libclip.so
可用设备: {'host': {'available': True, 'version': '', 'mem_info': {'remain': 0, 'total': 0}}, 'devices': {'host_version': 'V3.6.2_20250603154858', 'dev_version': 'V3.6.2_20250603154858', 'count': 1, 'devices_info': [{'temp': 37, 'cpu_usage': 1, 'npu_usage': 0, 'mem_info': {'remain': 7022, 'total': 7040}}]}}
[I][                             run][  31]: AXCLWorker start with devid 0

input size: 1
    name:    image [unknown] [unknown]
        1 x 3 x 336 x 336


output size: 1
    name: unnorm_image_features
        1 x 768

[I][              load_image_encoder][  50]: nchw 336 336
[I][              load_image_encoder][  60]: image feature len 768

input size: 1
    name:     text [unknown] [unknown]
        1 x 52


output size: 1
    name: unnorm_text_features
        1 x 768

[I][               load_text_encoder][  44]: text feature len 768
[I][                  load_tokenizer][  60]: text token len 52
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [01:40<00:00,  9.93it/s]
* Running on local URL:  http://0.0.0.0:7860

If your Raspberry Pi 5's IP address is 192.168.1.100, open http://192.168.1.100:7860 in a web browser to use the web app.
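The log above shows the loader first trying `pyclip/aarch64/libclip.so`, failing, and then falling back to `pyclip/libclip.so`. The sketch below illustrates that kind of fallback using Python's standard `ctypes` module. It is not the code in `pyclip.py`: the function name `load_first_available` and the messages are hypothetical, and only the two candidate paths are taken from the log.

```python
import ctypes
from pathlib import Path


def load_first_available(candidates):
    """Try each candidate path in order.

    Returns (handle, path) for the first shared library that loads,
    or (None, None) if none of them can be loaded.
    """
    for candidate in candidates:
        path = Path(candidate)
        print(f"Trying to load: {path}")
        if not path.is_file():
            print(f"File not found: {path}")
            continue
        try:
            return ctypes.CDLL(str(path)), path
        except OSError as exc:
            # Raised for a wrong architecture or missing dependent libraries.
            print(f"Failed to load: {path} ({exc})")
    return None, None


# Candidate order mirrors the log: architecture subfolder first, then pyclip/ itself.
base = Path("pyclip")
handle, used_path = load_first_available(
    [base / "aarch64" / "libclip.so", base / "libclip.so"]
)
if handle is None:
    print("libclip.so could not be loaded from any candidate path")
else:
    print(f"Successfully loaded: {used_path}")
```

If every candidate fails, first check that `libclip.so` was copied into `pyclip/` as in the `cp install/lib/aarch64/libclip.so pyclip/` step above.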