README.md · loay/Arabic-OCR-Qwen2.5-VL-7B-Vision at main

Update README.md

7e2f762 verified 4 months ago

1.23 kB

	---
	license: apache-2.0
	language:
	- en
	- ar
	library_name: transformers
	tags:
	- unsloth
	- qwen
	- qwen2.5-vl
	- arabic
	- ocr
	- vision
	- text-extraction
	- merged
	- lora
	pipeline_tag: image-to-text
	base_model: unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit
	---

	# ArabicOCR-Qwen2.5-VL-7B-Vision

	This repository contains the `float16` merged version of a Vision-Language Model (VLM), fine-tuned by loay for the specific task of performing Optical Character Recognition (OCR) on Arabic text from images.

	The model was created by fine-tuning the `unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit` model using LoRA adapters. The high-performance training was made possible by the Unsloth library, and the adapters were then merged back into the base model for easy deployment.

	## Model Details

	- Fine-tuned by: [loay](https://huggingface.co/loay)
	- Base Model: `unsloth/Qwen2.5-VL-7B-Instruct-bnb-4bit`
	- Fine-tuning Task: Arabic Optical Character Recognition (OCR)
	- Training Data: The model was trained on a curated dataset of images containing Arabic text and their corresponding transcriptions.
	- Output Format: This is a `float16` precision model, ideal for inference on GPUs with sufficient VRAM (requires >14GB).