Improve model card: Add `library_name`, explicit paper and code links (#3)
- Improve model card: Add `library_name`, explicit paper and code links (6526c2455757fa91b7aae2c0aaa2783d627be85e)
Co-authored-by: Niels Rogge <[email protected]>
README.md
CHANGED
@@ -1,9 +1,16 @@
 ---
-license: apache-2.0
 language:
 - en
+license: apache-2.0
 pipeline_tag: text-generation
+library_name: transformers
 ---
+
+# SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment
+
+**Paper**: [SmallThinker: A Family of Efficient Large Language Models Natively Trained for Local Deployment](https://huggingface.co/papers/2507.20984)
+**Code**: [https://github.com/SJTU-IPADS/SmallThinker](https://github.com/SJTU-IPADS/SmallThinker)
+
 ## Introduction
 
 <p align="center">
@@ -42,7 +49,7 @@ All models are evaluated in non-thinking mode.
 | Qwen3 0.6B | 0.6 | 148.56 | 94.91 | 45.93 | 15.29 | 27.44 | 13.32 | 9.76 |
 | Qwen3 1.7B | 1.3 | 62.24 | 41.00 | 20.29 | 6.09 | 11.08 | 6.35 | 4.15 |
 | Qwen3 1.7B+limited memory | limit 1G | 2.66 | 1.09 | 1.00 | 0.47 | - | - | 0.11 |
-| Gemma3n E2B | 1G, theoretically | 36.88 | 27.06 | 12.50 | 3.80 | 6.66 | 3.
+| Gemma3n E2B | 1G, theoretically | 36.88 | 27.06 | 12.50 | 3.80 | 6.66 | 3.80 | 2.45 |
 
 Note: i9 14900, 1+13 8ge4 use 4 threads, others use the number of threads that can achieve the maximum speed. All models here have been quantized to q4_0.
 
@@ -115,5 +122,4 @@ from modelscope import AutoModelForCausalLM, AutoTokenizer
 ## Statement
 - Due to the constraints of its model size and the limitations of its training data, its responses may contain factual inaccuracies, biases, or outdated information.
 - Users bear full responsibility for independently evaluating and verifying the accuracy and appropriateness of all generated content.
 - SmallThinker does not possess genuine comprehension or consciousness and cannot express personal opinions or value judgments.
-
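The front-matter change in the first hunk is small but load-bearing: `library_name: transformers` is the field the Hub uses to pick the code snippet and inference widget for the model page, and `license`/`pipeline_tag` must survive the reordering. A minimal sanity check of the merged metadata (a sketch, assuming PyYAML is available; the YAML string below is copied from the `+` side of the hunk, not an official validation tool):

```python
import yaml  # PyYAML; assumed installed (pip install pyyaml)

# Front matter of README.md as it reads after this PR is applied.
front_matter = """\
language:
- en
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
"""

meta = yaml.safe_load(front_matter)

# Field this PR adds:
assert meta["library_name"] == "transformers"
# Fields that must survive the reordering:
assert meta["license"] == "apache-2.0"
assert meta["pipeline_tag"] == "text-generation"
assert meta["language"] == ["en"]

print(sorted(meta))  # ['language', 'library_name', 'license', 'pipeline_tag']
```

A check like this catches the most common model-card regression: a stray `-` or indentation slip in the YAML block that silently drops a tag from the Hub page.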