Create README.md
README.md
ADDED
@@ -0,0 +1,36 @@
# PaddleOCR v4 (PP-OCRv4)

## Model Description
**PP-OCRv4** is the fourth-generation end-to-end optical character recognition system from the PaddlePaddle team.
It combines a lightweight **text detection → angle classification → text recognition** pipeline with improved training techniques and data augmentation, delivering higher accuracy and robustness than earlier PP-OCR releases while staying efficient enough for real-time use.

PP-OCRv4 supports multilingual OCR (Latin and non-Latin scripts), irregular layouts (rotated or curved text), and challenging inputs such as the noisy or low-resolution images common in mobile and document-scan scenarios.

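The three-stage flow described above can be sketched with the `paddleocr` Python package. This is a minimal sketch, assuming the 2.x-series API (a `PaddleOCR` constructor that accepts `use_angle_cls` and `lang`, and an `ocr()` method that accepts a `cls` flag); the helper name `run_ocr` is hypothetical, and other releases of the package may expose a different interface.

```python
def run_ocr(image_path: str, lang: str = "en", use_angle_cls: bool = True):
    """Run detection -> (optional) angle classification -> recognition.

    Assumes the paddleocr 2.x Python API; adjust for other releases.
    """
    # Imported lazily so this module still loads where paddleocr
    # is not installed.
    from paddleocr import PaddleOCR

    # The constructor loads (and on first use downloads) the detection,
    # angle-classification, and recognition models for the given language.
    engine = PaddleOCR(use_angle_cls=use_angle_cls, lang=lang)

    # cls=True asks the pipeline to run the angle classifier on each
    # detected text crop before recognition.
    return engine.ocr(image_path, cls=use_angle_cls)
```

Calling `run_ocr("receipt.jpg")` (a hypothetical file name) would return the raw pipeline result for that image. Constructing the engine once and reusing it across images avoids reloading the models on every call.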
## Features
- **End-to-end OCR**: text detection, optional angle classification, and text recognition in one pipeline.
- **Multilingual support**: pretrained models for English, Chinese, and dozens of other languages; the models can be fine-tuned on domain-specific text.
- **Robust in real-world conditions**: handles rotation, perspective distortion, blur, low light, and complex backgrounds.
- **Lightweight & fast**: practical for both mobile apps and large-scale server deployments.
- **Flexible I/O**: works with photos, scans, screenshots, receipts, invoices, ID cards, dashboards, and UI text.
- **Extensible**: swap components (detector/recognizer), add language packs, or fine-tune on domain datasets.

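The component swapping mentioned in the list above can be sketched as a small helper that assembles constructor arguments. This is a sketch under assumptions: it relies on the `det_model_dir`, `rec_model_dir`, `cls_model_dir`, and `rec_char_dict_path` keyword arguments of the `paddleocr` 2.x `PaddleOCR` constructor, and the helper name `build_ocr_kwargs` and all paths are hypothetical.

```python
from typing import Any, Dict, Optional


def build_ocr_kwargs(
    lang: str = "en",
    use_angle_cls: bool = True,
    det_model_dir: Optional[str] = None,
    rec_model_dir: Optional[str] = None,
    cls_model_dir: Optional[str] = None,
    rec_char_dict_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for a PaddleOCR(...) call.

    Only overrides that were actually provided are included, so the
    package's default pretrained models are used for everything else.
    """
    kwargs: Dict[str, Any] = {"lang": lang, "use_angle_cls": use_angle_cls}
    overrides = {
        "det_model_dir": det_model_dir,            # custom text detector
        "rec_model_dir": rec_model_dir,            # custom text recognizer
        "cls_model_dir": cls_model_dir,            # custom angle classifier
        "rec_char_dict_path": rec_char_dict_path,  # character set of a custom recognizer
    }
    kwargs.update({k: v for k, v in overrides.items() if v is not None})
    return kwargs


# Example: keep the default detector, swap in a fine-tuned recognizer.
custom = build_ocr_kwargs(lang="ch", rec_model_dir="./my_finetuned_rec")
print(custom)
```

The resulting dictionary would be passed as `PaddleOCR(**custom)`. A recognizer trained with a custom character set typically also needs its dictionary file supplied alongside the model directory.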
## Use Cases
- Document digitization (invoices, receipts, forms, contracts)
- RPA and back-office automation (reading text from screens and scanned documents)
- Mobile scanning apps and camera-based translation or read-aloud
- Industrial and retail analytics (labels, price tags, shelf tags)
- Accessibility (screen readers and read-aloud applications)

## Inputs and Outputs
**Input**: an image (photo, scan, or screenshot).

**Output**: a list of detected text regions, each with:
- bounding box (rectangular or polygonal)
- recognized text string
- optional confidence score and orientation

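The output structure listed above can be made concrete with a small parser. This is a minimal sketch, assuming the per-page result shape returned by the `paddleocr` 2.x `ocr()` method, where each entry is `[box, (text, score)]` and `box` is a list of `[x, y]` vertices; the `TextRegion` and `parse_page` names are hypothetical, the sample data is made up for illustration, and orientation is left out because this result shape does not carry it.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class TextRegion:
    box: Tuple[Point, ...]  # polygon vertices in image coordinates
    text: str               # recognized string
    score: float            # recognition confidence in [0, 1]


def parse_page(raw_page: Optional[Sequence], min_score: float = 0.0) -> List[TextRegion]:
    """Convert one page of raw results into TextRegion objects.

    Regions whose confidence is below min_score are dropped. A page of
    None (assumed to mean "no text found") yields an empty list.
    """
    regions: List[TextRegion] = []
    for box, (text, score) in raw_page or []:
        if score >= min_score:
            vertices = tuple((float(x), float(y)) for x, y in box)
            regions.append(TextRegion(vertices, text, float(score)))
    return regions


# Made-up sample in the assumed shape: two regions on one page.
sample_page = [
    [[[10, 10], [120, 10], [120, 40], [10, 40]], ("TOTAL", 0.98)],
    [[[10, 50], [90, 50], [90, 80], [10, 80]], ("12.50", 0.61)],
]

regions = parse_page(sample_page, min_score=0.8)
print([r.text for r in regions])  # prints ['TOTAL']
```

Filtering on the confidence score is a simple way to trade recall for precision; a suitable threshold depends on the image quality of the target domain.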
## License
- Licensed under [Apache-2.0](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/LICENSE)

## References
- GitHub repo: [https://github.com/PaddlePaddle/PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR)
- Model zoo & documentation: [Models list](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.7/doc/doc_en/models_list_en.md)