Riksarkivet/HTR_pipeline_models

Image-to-Text

Swedish

HTR

Gabriel committed on Jul 11, 2023

Commit 017d807

1 Parent(s): 034bb1e

Update README.md

Files changed (1)

README.md +18 -5

README.md CHANGED

@@ -21,6 +21,22 @@ The Swedish National Archives presents an end-to-end Handwritten Text Recognitio
 The models are designed to provide a generic pipeline for handwritten text recognition, offering robust performance for documents from the 16th to the 19th century.
+## Evaluation
+The Swedish National Archives HTR pipeline has been evaluated using standard evaluation metrics for Handwritten Text Recognition. The Word Error Rate (WER) and Character Error Rate (CER) are commonly used to assess the accuracy of the pipeline.
+The reported performance metrics are obtained on a test dataset that represents a diverse range of historical running-text documents from the 16th to the 19th century. It is important to note that the actual performance may vary depending on the specific documents and handwriting styles encountered in practical usage.
+| Metric | Performance |
+|--------|-------------|
+| WER    | XX%         |
+| CER    | XX%         |
+The WER measures the percentage of incorrectly recognized words compared to the ground truth, while the CER measures the percentage of incorrectly recognized characters.
+Regular evaluations are conducted to monitor and improve the performance of the pipeline. As new evaluation results become available, this table will be updated to reflect the most recent performance metrics.
 ## Intended Use
 The Swedish National Archives HTR pipeline is intended to be used for the following purposes:
@@ -53,15 +69,12 @@ The training data was annotated to provide ground truth for text region and line
 The data can be found here: (WIP will be added soon)
 ## Caveats and Future Work
 Although the Swedish National Archives HTR pipeline has been trained and optimized for running-text documents from the specified time period, there are a few caveats and considerations to keep in mind:
-- **Out-of-Scope Documents**: The pipeline may encounter difficulties when processing documents that deviate significantly from the expected document characteristics or handwriting styles present in the training data.
-- **Continuous Improvement**: The pipeline can benefit from continuous updates and improvements as new training data becomes available and advancements in OCR technology occur. Regular evaluations and updates are recommended to enhance its performance and adaptability.
-- **User Feedback**: Users are encouraged to provide feedback on the pipeline's performance, identify issues, and report any potential biases or limitations. This feedback can contribute to refining the pipeline and addressing concerns.
+Continuous Improvement: The pipeline is continuously being updated and improved as new training data becomes available and advancements in OCR technology occur. With access to more training data, the models will be updated to further enhance their performance and adaptability.
+User Feedback: Users are encouraged to provide feedback on the pipeline's performance, identify issues, and report any potential biases or limitations. This feedback is highly valuable in refining the pipeline, addressing concerns, and informing future updates.
 ## References
 If you would like to learn more about the Swedish National Archives HTR pipeline or access the training data, please refer to the following resources:
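
The Evaluation section added in this commit describes WER and CER only in words, and the table values are still placeholders. As a minimal sketch of how such metrics are commonly computed (this is not the pipeline's own evaluation code, which the diff does not show), both can be taken as a Levenshtein edit distance divided by the length of the ground truth, counted in words for WER and in characters for CER. The function names `edit_distance`, `wer`, and `cer` and the sample strings are hypothetical, chosen for illustration.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: minimum number of substitutions,
    insertions, and deletions needed to turn `ref` into `hyp`."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # turning i reference items into an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h (or match)
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character-level edit distance / reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical ground-truth line and recognizer output (one wrong character).
truth = "den gamla kyrkan"
output = "den gamle kyrkan"
print(f"WER: {wer(truth, output):.2%}")
print(f"CER: {cer(truth, output):.2%}")
```

Because insertions count as errors, both ratios can exceed 100% when the recognizer outputs far more text than the ground truth contains, so "percentage of incorrectly recognized words" is an approximation of what WER measures.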