Gabriel committed on
Commit 017d807 · 1 Parent(s): 034bb1e

Update README.md

Files changed (1)
  1. README.md +18 -5
README.md CHANGED
@@ -21,6 +21,22 @@ The Swedish National Archives presents an end-to-end Handwritten Text Recognitio
 
  The models are designed to provide a generic pipeline for handwritten text recognition, offering robust performance for documents from the 16th to the 19th century.
 
  ## Intended Use
  The Swedish National Archives HTR pipeline is intended to be used for the following purposes:
 
@@ -53,15 +69,12 @@ The training data was annotated to provide ground truth for text region and line
 
  The data can be found here: (WIP will be added soon)
 
-
  ## Caveats and Future Work
  Although the Swedish National Archives HTR pipeline has been trained and optimized for running-text documents from the specified time period, there are a few caveats and considerations to keep in mind:
 
- - **Out-of-Scope Documents**: The pipeline may encounter difficulties when processing documents that deviate significantly from the expected document characteristics or handwriting styles present in the training data.
-
- - **Continuous Improvement**: The pipeline can benefit from continuous updates and improvements as new training data becomes available and advancements in OCR technology occur. Regular evaluations and updates are recommended to enhance its performance and adaptability.
 
- - **User Feedback**: Users are encouraged to provide feedback on the pipeline's performance, identify issues, and report any potential biases or limitations. This feedback can contribute to refining the pipeline and addressing concerns.
 
  ## References
  If you would like to learn more about the Swedish National Archives HTR pipeline or access the training data, please refer to the following resources:
 
 
  The models are designed to provide a generic pipeline for handwritten text recognition, offering robust performance for documents from the 16th to the 19th century.
 
+ ## Evaluation
+
+ The Swedish National Archives HTR pipeline has been evaluated using standard evaluation metrics for Handwritten Text Recognition. The Word Error Rate (WER) and Character Error Rate (CER) are commonly used to assess the accuracy of the pipeline.
+
+ The reported performance metrics are obtained on a test dataset that represents a diverse range of historical running-text documents from the 16th to the 19th century. It is important to note that the actual performance may vary depending on the specific documents and handwriting styles encountered in practical usage.
+
+ | Metric | Performance |
+ |--------|-------------|
+ | WER | XX% |
+ | CER | XX% |
+
+ The WER measures the percentage of incorrectly recognized words compared to the ground truth, while the CER measures the percentage of incorrectly recognized characters.
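
The WER and CER definitions above can be illustrated with a minimal sketch. It assumes the common convention of dividing the Levenshtein edit distance (substitutions, insertions, deletions) by the length of the ground truth; the README does not state the exact formula or tooling used for the reported numbers, and the function names `edit_distance`, `wer`, and `cer` are hypothetical helpers written for this example only.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and deletions
    needed to turn `ref` into `hyp`. Works on strings or lists of tokens.
    """
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def _error_rate(ref_units, hyp_units):
    # Assumption: normalise by the ground-truth length, as is conventional.
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


def wer(reference, hypothesis):
    """Word Error Rate: edit distance over whitespace-separated words."""
    return _error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    """Character Error Rate: edit distance over characters, spaces included."""
    return _error_rate(list(reference), list(hypothesis))


if __name__ == "__main__":
    truth = "anno domini 1650"
    predicted = "anno domino 1650"
    print(f"WER: {wer(truth, predicted):.2%}")  # one of three words differs
    print(f"CER: {cer(truth, predicted):.2%}")  # one of sixteen characters differs
```

Under this convention a single wrong character makes the whole word count as an error, so the WER is usually higher than the CER for the same output, and both can exceed 100% when the prediction contains many insertions.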
+
+ Regular evaluations are conducted to monitor and improve the performance of the pipeline. As new evaluation results become available, this table will be updated to reflect the most recent performance metrics.
+
+
  ## Intended Use
  The Swedish National Archives HTR pipeline is intended to be used for the following purposes:
 
 
 
  The data can be found here: (WIP will be added soon)
 
  ## Caveats and Future Work
  Although the Swedish National Archives HTR pipeline has been trained and optimized for running-text documents from the specified time period, there are a few caveats and considerations to keep in mind:
 
+ Continuous Improvement: The pipeline is continuously being updated and improved as new training data becomes available and advancements in OCR technology occur. With access to more training data, the models will be updated to further enhance their performance and adaptability.
 
+ User Feedback: Users are encouraged to provide feedback on the pipeline's performance, identify issues, and report any potential biases or limitations. This feedback is highly valuable in refining the pipeline, addressing concerns, and informing future updates.
 
  ## References
  If you would like to learn more about the Swedish National Archives HTR pipeline or access the training data, please refer to the following resources: