Commit d6eedb5 (verified) by jakep-allenai · 1 Parent(s): a626e89

Update README.md

  1. README.md +43 -3
README.md CHANGED
@@ -1,3 +1,43 @@
- ---
- license: apache-2.0
- ---
+ ---
+ language:
+ - en
+ license: apache-2.0
+ datasets:
+ - allenai/olmOCR-mix-0225
+ base_model:
+ - Qwen/Qwen2.5-VL-7B-Instruct
+ library_name: transformers
+ ---
+
+ <img alt="olmOCR Logo" src="https://huggingface.co/datasets/allenai/blog-images/resolve/main/olmocr/olmocr.png" width="242px" style="margin-left:auto; margin-right:auto; display:block">
+
+ # olmOCR-7B-0725-FP8
+
+ FP8-quantized version of [olmOCR-7B-0825](https://huggingface.co/allenai/olmOCR-7B-0825), produced with llmcompressor.
+
+ This is a release of the olmOCR model that is fine-tuned from Qwen2.5-VL-7B-Instruct using the
+ [olmOCR-mix-0225](https://huggingface.co/datasets/allenai/olmOCR-mix-0225) dataset.
+
+ Quick links:
+ - 📃 [Paper](https://olmocr.allenai.org/papers/olmocr.pdf)
+ - 🤗 [Dataset](https://huggingface.co/datasets/allenai/olmOCR-mix-0225)
+ - 🛠️ [Code](https://github.com/allenai/olmocr)
+ - 🎮 [Demo](https://olmocr.allenai.org/)
+
+ The best way to use this model is via the [olmOCR toolkit](https://github.com/allenai/olmocr).
+ The toolkit comes with an efficient inference setup via sglang that can handle millions of documents
+ at scale.
+
+ ## Usage
+
+ This model expects a single document image as input, rendered so that its longest dimension is 1288 pixels.
+
+ The prompt must also contain additional metadata from the document. The easiest way to generate it
+ is to use the methods provided by the [olmOCR toolkit](https://github.com/allenai/olmocr).
+
+
+ ## License and use
+
+ olmOCR is licensed under the Apache 2.0 license.
+ olmOCR is intended for research and educational use.
+ For more information, please see our [Responsible Use Guidelines](https://allenai.org/responsible-use).
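
Note on the Usage section added above: the README states that the input image should be rendered with its longest dimension at 1288 pixels. As a rough illustration of that sizing rule only, the sketch below computes output dimensions that preserve aspect ratio. The helper name `target_size` and the constant name are hypothetical and are not part of the olmOCR toolkit; the toolkit's own rendering and prompt-building methods remain the recommended path, and this sketch does not build the metadata prompt.

```python
# Assumption: 1288 is taken from the README's Usage section; everything else
# here (names, rounding choice) is illustrative and not the toolkit's API.
TARGET_LONGEST_DIM = 1288


def target_size(width: int, height: int, longest: int = TARGET_LONGEST_DIM) -> tuple[int, int]:
    """Return (new_width, new_height) with the longer side equal to `longest`.

    The shorter side is scaled by the same factor and rounded to the nearest
    pixel, so the aspect ratio is preserved as closely as integers allow.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if width >= height:
        # Landscape or square page: pin the width, scale the height.
        return longest, max(1, round(height * longest / width))
    # Portrait page: pin the height, scale the width.
    return max(1, round(width * longest / height)), longest


if __name__ == "__main__":
    # A US Letter page rasterized at 300 DPI is 2550 x 3300 pixels.
    print(target_size(2550, 3300))
```

The resulting pair could then be passed to an image library's resize call (for example Pillow's `Image.resize`), or used to choose the rasterization size when rendering a PDF page directly.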