Update README.md
README.md
CHANGED
|
@@ -6,7 +6,10 @@ library_name: fasttext
|
|
| 6 |
pipeline_tag: text-classification
|
| 7 |
---
|
| 8 |
# llm-data-textbook-quality-fasttext-classifer-v2
|
| 9 |
-
|
| 10 |
The model is built on fastText: it can classify more than 2000 examples per second on CPU, so it can be used **on-the-fly**.
|
| 11 |
This model can classify whether a text has high educational value (a more explicitly defined criterion than textbook quality). This change in definition is a substantial departure from [kenhktsui/llm-data-textbook-quality-fasttext-classifer-v1](https://huggingface.co/kenhktsui/llm-data-textbook-quality-fasttext-classifer-v1).
|
| 12 |
It can be used as a filter for data curation when training an LLM.
|
| 6 |
pipeline_tag: text-classification
|
| 7 |
---
|
| 8 |
# llm-data-textbook-quality-fasttext-classifer-v2
|
| 9 |
+
|
| 10 |
+

|
| 11 |
+
|
| 12 |
+
This educational value classifier is deeply inspired by [Textbooks Are All You Need](https://arxiv.org/abs/2306.11644), where a classifier was developed to predict the educational value of data, and was then used for data filtering.
|
| 13 |
The model is built on fastText: it can classify more than 2000 examples per second on CPU, so it can be used **on-the-fly**.
|
| 14 |
This model can classify whether a text has high educational value (a more explicitly defined criterion than textbook quality). This change in definition is a substantial departure from [kenhktsui/llm-data-textbook-quality-fasttext-classifer-v1](https://huggingface.co/kenhktsui/llm-data-textbook-quality-fasttext-classifer-v1).
|
| 15 |
It can be used as a filter for data curation when training an LLM.
|
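To make the filtering use case concrete, here is a minimal sketch of an on-the-fly filter built around a fastText-style predictor. It is not taken from the model card: the label name `__label__High`, the 0.5 threshold, and the helper names `high_value_probability`, `filter_documents`, and `stub_predict` are all assumptions for illustration. The stub stands in for a real model so the sketch runs without downloading anything.

```python
from typing import Callable, Iterable, Iterator, Sequence, Tuple

# A predictor maps one line of text to (labels, probabilities), the shape
# returned by fastText's `model.predict(text, k=-1)`.
Predictor = Callable[[str], Tuple[Sequence[str], Sequence[float]]]

# Assumption: hypothetical label name; check the labels of the real model.
HIGH_LABEL = "__label__High"


def high_value_probability(predict: Predictor, text: str) -> float:
    """Return the probability assigned to HIGH_LABEL (0.0 if absent)."""
    # fastText predicts one line at a time, so collapse all whitespace
    # (including newlines) into single spaces first.
    single_line = " ".join(text.split())
    labels, probs = predict(single_line)
    return float(dict(zip(labels, probs)).get(HIGH_LABEL, 0.0))


def filter_documents(
    predict: Predictor, docs: Iterable[str], threshold: float = 0.5
) -> Iterator[str]:
    """Lazily yield documents whose HIGH_LABEL probability meets the threshold."""
    for doc in docs:
        if high_value_probability(predict, doc) >= threshold:
            yield doc


def stub_predict(text: str) -> Tuple[Sequence[str], Sequence[float]]:
    """Stand-in for a real model so the sketch runs without a model file."""
    p_high = 0.9 if "theorem" in text else 0.1
    return ("__label__High", "__label__Low"), (p_high, 1.0 - p_high)


docs = ["A theorem and\nits proof.", "Buy now! Limited offer."]
kept = list(filter_documents(stub_predict, docs))
print(kept)
```

With the real classifier, the stub could be swapped for something like `lambda t: model.predict(t, k=-1)`, where `model` comes from `fasttext.load_model(...)` pointed at the downloaded model file. Because `filter_documents` is a generator, it can sit directly inside a streaming data pipeline.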