Update README.md
README.md
CHANGED
@@ -15,7 +15,7 @@ For each document, we calculated a combined educational quality score by taking
 
 We trained Aleph-Alpha-GermanWeb-Quality-Classifier-fastText using 185,403 documents in each class. We used 95% of the data (and the remaining 5% for validation) to train a fastText model to classify between high and low quality text data. It reached 92% precision and 91.5% recall on the validation set.
 
-Further details, including our LLM judging prompt, can be found in our
+Further details, including our LLM judging prompt, can be found in our accompanying paper (link to paper coming soon).
 
 ## Example Snippet
 