prithivMLmods committed on
Commit
ff80302
·
verified ·
1 Parent(s): 106ad94

Update README.md

Files changed (1)
  1. README.md +19 -21
README.md CHANGED
@@ -17,9 +17,9 @@ tags:
  - code
  ---
 
- # **rStar-Coder-Qwen3(exp)**
 
- > *rStar-Coder-Qwen3* is a high-efficiency, multi-domain model fine-tuned on *Qwen3-4B* using the **rStar-Coder** dataset enhanced with **code, math, and science expert clusters** and an extended **open code reasoning dataset**. This model blends symbolic precision, scientific logic, and structured output fluency—making it an ideal tool for developers, educators, and researchers seeking advanced reasoning under constrained compute.
 
  > \[!note]
  > GGUF: [https://huggingface.co/prithivMLmods/rStar-Coder-Qwen3-GGUF](https://huggingface.co/prithivMLmods/rStar-Coder-Qwen3-GGUF)
@@ -43,11 +43,26 @@ tags:
  5. **Structured Output Mastery**
  Seamlessly generates output in **LaTeX**, **Markdown**, **JSON**, **CSV**, and **YAML**, suited for research reports, technical documentation, and data formats.
 
- 6. **Optimized 4B Footprint for Versatile Deployment**
  Strikes a balance between performance and efficiency, making it deployable on **mid-range GPUs**, **offline clusters**, and advanced **edge AI systems**.
 
  ---
 
  ## **Quickstart with Transformers**
 
  ```python
@@ -91,21 +106,6 @@ print(response)
 
  ---
 
- ## **Dataset Seed**
-
- ```python
- from datasets import load_dataset
-
- # Load the reasoning dataset
- reasoning_dataset = load_dataset(
-     "microsoft/rStar-Coder",
-     data_files="seed_sft/data-00001-of-00020.parquet",
-     split="train"
- )
- ```
-
- ---
-
  ## **Intended Use**
 
  * Scientific tutoring, computational logic, and mathematical education
@@ -121,10 +121,8 @@ reasoning_dataset = load_dataset(
  * Specialized in technical and symbolic tasks—general chat may underperform
  * Prioritizes structured reasoning over emotional or casual tone generation
 
- ---
-
  ## **References**
 
  1. [Qwen2.5 Technical Report (2024)](https://arxiv.org/pdf/2412.15115)
  2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)
- 3. [rStar-Coder Dataset](https://huggingface.co/datasets/microsoft/rStar-Coder)
 
  - code
  ---
 
+ # **rStar-Coder-Qwen3**
 
+ > rStar-Coder-Qwen3 is a high-efficiency, multi-domain model fine-tuned on **Qwen-0.6B** using the **rStar-Coder** dataset enhanced with **code expert clusters** and an extended **open code reasoning dataset**. This model blends symbolic precision, scientific logic, and structured output fluency—making it an ideal tool for developers, educators, and researchers seeking advanced reasoning under constrained compute.
 
  > \[!note]
  > GGUF: [https://huggingface.co/prithivMLmods/rStar-Coder-Qwen3-GGUF](https://huggingface.co/prithivMLmods/rStar-Coder-Qwen3-GGUF)
 
  5. **Structured Output Mastery**
  Seamlessly generates output in **LaTeX**, **Markdown**, **JSON**, **CSV**, and **YAML**, suited for research reports, technical documentation, and data formats.
 
+ 6. **Optimized Lightweight Footprint for Versatile Deployment**
  Strikes a balance between performance and efficiency, making it deployable on **mid-range GPUs**, **offline clusters**, and advanced **edge AI systems**.
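
The structured-output claim in item 5 can be checked on the consumer side. The following is a minimal sketch, assuming the model returns JSON either bare or wrapped in a Markdown fence; `parse_json_response` is a hypothetical helper introduced here for illustration and is not part of the model card or of any library.

```python
import json
import re

# Build the fence marker programmatically so this snippet can itself
# live inside a Markdown code block.
FENCE = "`" * 3

# Matches the first fenced block, with an optional "json" language tag.
FENCE_RE = re.compile(FENCE + r"(?:json)?\s*\n(.*?)" + FENCE, re.DOTALL)


def parse_json_response(text: str):
    """Parse JSON from a model response (hypothetical helper).

    If the response contains a fenced block, only its body is parsed;
    otherwise the whole response is treated as the JSON payload.
    """
    match = FENCE_RE.search(text)
    payload = match.group(1) if match else text
    try:
        return json.loads(payload)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
```

Rejecting unparseable responses early keeps malformed output from reaching downstream tooling; the same pattern could be applied to YAML or CSV with the corresponding parser.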
 
  ---
 
+ ## Dataset Seed
+
+ ```python
+ from datasets import load_dataset
+
+ # Load the reasoning dataset
+ reasoning_dataset = load_dataset(
+     "microsoft/rStar-Coder",
+     data_files="seed_sft/data-00001-of-00020.parquet",
+     split="train"
+ )
+ ```
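
The snippet above loads a single Parquet shard by its literal path. As a rough sketch of how other shards might be addressed, the helper below rebuilds that path from a shard index. Everything here is an assumption for illustration: `seed_sft_shard` is a hypothetical name, and the zero-based numbering and the total of 20 shards are inferred only from the `data-00001-of-00020` file name, not confirmed against the dataset repository.

```python
def seed_sft_shard(index: int, total: int = 20) -> str:
    """Return the relative path of one seed_sft Parquet shard.

    Assumes (unverified) that shards are numbered 0..total-1 and that
    both numbers are zero-padded to five digits, as in the README path.
    """
    if not 0 <= index < total:
        raise ValueError(f"shard index {index} outside 0..{total - 1}")
    return f"seed_sft/data-{index:05d}-of-{total:05d}.parquet"


# The path used in the README corresponds to index 1 under these assumptions.
readme_shard = seed_sft_shard(1)
```

The result can be passed as `data_files=` to the `load_dataset("microsoft/rStar-Coder", ..., split="train")` call shown above; doing so downloads data, so it is left out of the sketch.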
+
+ ---
+
  ## **Quickstart with Transformers**
 
  ```python
 
 
  ---
 
  ## **Intended Use**
 
  * Scientific tutoring, computational logic, and mathematical education
 
  * Specialized in technical and symbolic tasks—general chat may underperform
  * Prioritizes structured reasoning over emotional or casual tone generation
 
  ## **References**
 
  1. [Qwen2.5 Technical Report (2024)](https://arxiv.org/pdf/2412.15115)
  2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)
+ 3. [microsoft/rStar-Coder Dataset](https://huggingface.co/datasets/microsoft/rStar-Coder)