---
language:
- en
base_model:
- ds4sd/docling-models
pipeline_tag: object-detection
---

# Docling Model for Layout

This is the **Docling model for layout detection**, packaged so that it can be loaded and used like any other Hugging Face model.

This model is part of the [Docling repository](https://huggingface.co/ds4sd/docling-models), which provides document layout analysis tools.

## **Usage Example**

Here's how you can load and use the model:

```python
import torch
from PIL import Image
from transformers import RTDetrForObjectDetection, RTDetrImageProcessor

# Load the model and processor
image_processor = RTDetrImageProcessor.from_pretrained("HuggingPanda/docling-layout")
model = RTDetrForObjectDetection.from_pretrained("HuggingPanda/docling-layout")

# Load an image and make sure it has three channels
image = Image.open("hocr_output_page-0001.jpg").convert("RGB")

# Preprocess the image (resized to 640x640)
resize = {"height": 640, "width": 640}
inputs = image_processor(
    images=image,
    return_tensors="pt",
    size=resize,
)

# Perform inference
with torch.no_grad():
    outputs = model(**inputs)

# Post-process results; target_sizes expects (height, width),
# while PIL's image.size is (width, height), hence the reversal
results = image_processor.post_process_object_detection(
    outputs,
    target_sizes=torch.tensor([image.size[::-1]]),
    threshold=0.3,
)

# Print detected objects
for result in results:
    for score, label_id, box in zip(result["scores"], result["labels"], result["boxes"]):
        score, label = score.item(), label_id.item()
        box = [round(i, 2) for i in box.tolist()]
        # Note the +1 offset when looking up the class name in id2label
        print(f"{model.config.id2label[label+1]}: {score:.2f} {box}")
```
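
Downstream code often wants plain Python records rather than tensors. The following is a minimal sketch, not part of the model or of the Transformers API: `to_records` is a hypothetical helper that takes the scores, label ids, and boxes as plain lists and returns dictionaries sorted top to bottom, then left to right. It assumes boxes are `[x_min, y_min, x_max, y_max]` in pixels, as returned by `post_process_object_detection`, and it takes the label mapping as an argument so that any index offset stays the caller's decision. The label names and numbers in the demo are toy values, not real model output.

```python
from typing import Dict, List, Sequence


def to_records(
    scores: Sequence[float],
    label_ids: Sequence[int],
    boxes: Sequence[Sequence[float]],
    id2label: Dict[int, str],
    min_score: float = 0.0,
) -> List[dict]:
    """Turn parallel detection lists into dicts sorted top-to-bottom, left-to-right."""
    records = []
    for score, label_id, box in zip(scores, label_ids, boxes):
        if score < min_score:
            continue
        x_min, y_min, x_max, y_max = box
        records.append(
            {
                # Fall back to the raw id if the mapping has no entry for it.
                "label": id2label.get(label_id, str(label_id)),
                "score": round(score, 2),
                "box": [round(v, 2) for v in (x_min, y_min, x_max, y_max)],
            }
        )
    # Sort by top edge, then by left edge (a rough ordering, not true reading order).
    records.sort(key=lambda r: (r["box"][1], r["box"][0]))
    return records


# Toy values standing in for real model output (hypothetical labels and boxes).
example = to_records(
    scores=[0.91, 0.45, 0.12],
    label_ids=[1, 0, 1],
    boxes=[
        [50.0, 300.0, 500.0, 600.0],
        [50.0, 40.0, 500.0, 90.0],
        [0.0, 0.0, 10.0, 10.0],
    ],
    id2label={0: "Title", 1: "Text"},
    min_score=0.3,
)
for record in example:
    print(record)
```

With real output, the lists can be obtained from each `result` via `result["scores"].tolist()`, `result["labels"].tolist()`, and `result["boxes"].tolist()`. If the label ids need the same `+ 1` offset used in the usage example, apply it before calling the helper.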

## **Model Information**

- **Base Model:** RT-DETR (Real-Time Detection Transformer)
- **Intended Use:** Layout detection for documents
- **Framework:** [Hugging Face Transformers](https://huggingface.co/docs/transformers/index)
- **Dataset Used:** Internal dataset for document structure recognition
- **License:** Apache 2.0

## **Citing This Model**

If you use this model in your work, please cite the main **Docling repository**:

```
@misc{docling2024,
  title={Docling Models for Document Layout Analysis},
  author={DS4SD Team},
  year={2024},
  howpublished={Hugging Face Repository},
  url={https://huggingface.co/ds4sd/docling-models}
}
```

For more details, visit the main repo: [ds4sd/docling-models](https://huggingface.co/ds4sd/docling-models).

## **Contact**

For questions or issues, please open a discussion on Hugging Face or contact [[email protected]].