PaddleOCR v4 (PP-OCRv4)

Model Description

PP-OCRv4 is the fourth-generation end-to-end optical character recognition system from the PaddlePaddle team.
It combines a lightweight text detection → angle classification → text recognition pipeline with improved training techniques and data augmentation, delivering higher accuracy and robustness while staying efficient for real-time use.

PP-OCRv4 supports multilingual OCR (Latin and non-Latin scripts), irregular layouts (rotated/curved text), and challenging inputs such as noisy or low-resolution images often found in mobile and document-scan scenarios.
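The three-stage pipeline can be sketched in Python. This is an illustrative outline only, not the Nexa SDK or PaddleOCR API: the stage functions below are hypothetical stubs standing in for the real detector, angle classifier, and recognizer models.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = List[Tuple[int, int]]  # text region as a polygon of (x, y) points

@dataclass
class OcrResult:
    box: Box
    text: str
    score: float

# Hypothetical stand-ins for the real models; a deployment would call
# the detection, classification, and recognition networks here.
def detect_text_regions(image) -> List[Box]:
    return [[(0, 0), (100, 0), (100, 20), (0, 20)]]

def correct_orientation(image, box: Box):
    return image  # the angle classifier would rotate the crop upright

def recognize_text(crop) -> Tuple[str, float]:
    return "hello", 0.98

def run_pipeline(image) -> List[OcrResult]:
    """Detection -> optional angle classification -> recognition."""
    results = []
    for box in detect_text_regions(image):
        crop = correct_orientation(image, box)
        text, score = recognize_text(crop)
        results.append(OcrResult(box=box, text=text, score=score))
    return results
```

The key design point is that each stage is swappable: a different detector or recognizer (or an extra language pack) can replace one stage without touching the others.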

Features

  • End-to-end OCR: text detection, optional angle classification, and text recognition in one pipeline.
  • Multilingual support: pretrained models for English, Chinese, and dozens of other languages; easy finetuning for domain text.
  • Robust in real-world conditions: handles rotation, perspective distortion, blur, low light, and complex backgrounds.
  • Lightweight & fast: practical for both mobile apps and large-scale server deployments.
  • Flexible I/O: works with photos, scans, screenshots, receipts, invoices, ID cards, dashboards, and UI text.
  • Extensible: swap components (detector/recognizer), add language packs, or finetune on domain datasets.

Use Cases

  • Document digitization (invoices, receipts, forms, contracts)
  • RPA and back-office automation (screen/OCR flows)
  • Mobile scanning apps and camera-based translation/read-aloud
  • Industrial and retail analytics (labels, price tags, shelf tags)
  • Accessibility (screen-readers and read-aloud applications)

Inputs and Outputs

Input: Image (photo, scan, or screenshot).
Output: A list of detected text regions, each with:

  • bounding box (rectangular or polygonal)
  • recognized text string
  • optional confidence score and orientation
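A minimal sketch of what this output looks like and how it is typically post-processed. The field names (`box`, `text`, `score`) and the sample values are illustrative, not the SDK's actual schema.

```python
# One entry per detected region: polygon, recognized string, confidence.
regions = [
    {"box": [[12, 8], [208, 8], [208, 40], [12, 40]], "text": "INVOICE", "score": 0.97},
    {"box": [[12, 52], [150, 52], [150, 78], [12, 78]], "text": "Total: $42.00", "score": 0.91},
    {"box": [[160, 52], [200, 52], [200, 78], [160, 78]], "text": "~?~", "score": 0.31},
]

# Typical post-processing: drop low-confidence regions, keep reading order.
confident = [r for r in regions if r["score"] >= 0.5]
lines = [r["text"] for r in confident]
print(lines)  # ['INVOICE', 'Total: $42.00']
```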

How to use

⚠️ Hardware requirement: the model currently runs only on Qualcomm NPUs (e.g., Snapdragon-powered AI PCs).
Apple NPU support is planned next.

1) Install Nexa-SDK

  • Download the SDK and follow the steps under the "Deploy" section of Nexa's model page: Download Windows arm64 SDK
  • (Other platforms coming soon)

2) Get an access token

Create a token in the Model Hub, then log in:

nexa config set license '<access_token>'

3) Run the model

nexa infer NexaAI/paddleocr-npu

License

References
