---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
base_model:
- Qwen/Qwen3-4B-Thinking-2507
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
- competitive-programming
- code-reasoning
- programming
- algorithms
- problem-solving
---
17
+
18
+ # GetSoloTech/Qwen3-Code-Reasoning-4B
19
+
20
+ A finetuned version of Qwen3-4B-Thinking-2507 specifically optimized for competitive programming and code reasoning tasks. This model has been trained on the high-quality [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning) dataset to enhance its capabilities in solving complex programming problems with detailed reasoning.
21
+
22
+ ## 🎯 Model Overview
23
+
24
+ This model is a **LoRA-finetuned** version of [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) with the following specifications:
25
+
26
+ - **Base Model**: Qwen3-4B-Thinking-2507 (4.0B parameters)
27
+ - **Training Method**: LoRA (Low-Rank Adaptation)
28
+ - **Training Dataset**: GetSoloTech/Code-Reasoning
29
+ - **Training Framework**: Unsloth with QLoRA
30
+ - **Context Length**: 4096 tokens (configurable up to 262,144)
31
+ - **Model Type**: Causal Language Model with Thinking Capabilities
32
+
33
+ ## 🚀 Key Features
34
+
35
+ - **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
36
+ - **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
37
+ - **High-Quality Solutions**: Trained on solutions with ≥50% test case pass rates
38
+ - **Structured Output**: Optimized for generating well-reasoned programming solutions
39
+ - **Efficient Training**: Uses LoRA adapters for efficient parameter updates
40
+
41
+
### Dataset Statistics

- **Split**: Python
- **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
- **Quality Filter**: Only correctly solved problems with ≥50% test case pass rates

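The ≥50% pass-rate filter described above amounts to a simple post-processing step over scored solutions. A minimal sketch follows; the record fields here are hypothetical and do not reflect the dataset's actual schema:

```python
# Hypothetical records: each candidate solution carries its test results.
records = [
    {"problem_id": 1, "passed": 9, "total": 10},
    {"problem_id": 2, "passed": 4, "total": 10},
    {"problem_id": 3, "passed": 5, "total": 10},
]

# Keep only solutions that pass at least half of their test cases.
kept = [r for r in records if r["passed"] / r["total"] >= 0.5]

print([r["problem_id"] for r in kept])  # [1, 3]
```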
## 🔧 Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GetSoloTech/Qwen3-Code-Reasoning-4B"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Prepare input for a competitive programming problem
messages = [
    {"role": "system", "content": "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful."},
    {"role": "user", "content": "Your programming problem here..."},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a solution
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)

# Decode only the newly generated tokens
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print(content)
```
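Because the base model is a thinking model, the decoded output typically interleaves its reasoning with the final answer. If you want just the answer, you can split on the `</think>` marker. This is a minimal string-level sketch assuming that marker appears in the decoded text, as Qwen3 thinking models generally emit:

```python
def split_thinking(text: str):
    """Separate reasoning from the final answer on the </think> marker."""
    marker = "</think>"
    if marker in text:
        thinking, answer = text.split(marker, 1)
        return thinking.strip(), answer.strip()
    # No marker: treat the whole output as the answer.
    return "", text.strip()

# Hypothetical decoded output, for illustration only.
sample = "Let me consider edge cases first...</think>\nprint('hello')"
thinking, answer = split_thinking(sample)
print(answer)  # print('hello')
```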

## 📈 Performance Expectations

This finetuned model is expected to show improved performance on:

- **Competitive Programming Problems**: Better understanding of problem constraints and requirements
- **Code Generation**: More accurate and efficient solutions
- **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
- **Solution Completeness**: More comprehensive solutions with proper edge case handling

## 🎛️ Recommended Settings

### For Code Generation

- **Temperature**: 0.7
- **Top-p**: 0.8
- **Top-k**: 20
- **Max New Tokens**: 4096 (adjust based on problem complexity)

### For Reasoning Tasks

- **Temperature**: 0.6
- **Top-p**: 0.95
- **Top-k**: 20
- **Max New Tokens**: 81920 (for complex reasoning)

+ ## 🔗 Related Resources
116
+
117
+ - **Base Model**: [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
118
+ - **Training Dataset**: [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
119
+ - **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth)
120
+ - **Original Dataset**: [OpenCodeReasoning-2](https://huggingface.co/datasets/nvidia/OpenCodeReasoning-2)
121
+
122
+ ## 🤝 Contributing
123
+
124
+ This model was created using the Unsloth framework and the Code-Reasoning dataset. For questions about:
125
+ - The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
126
+ - The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
127
+ - The training framework: [Unsloth Documentation](https://docs.unsloth.ai/)
128
+
129
+ ## 📄 License
130
+
131
+ This model follows the same license as the base model (Apache 2.0). Please refer to the [base model license](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE) for details.
132
+
133
+ ## 🙏 Acknowledgments
134
+
135
+ - **Qwen Team** for the excellent base model
136
+ - **Unsloth Team** for the efficient training framework
137
+ - **NVIDIA Research** for the original OpenCodeReasoning-2 dataset
138
+
139
+ ## 📞 Contact
140
+
141
+ For questions about this finetuned model, please open an issue in the repository.
142
+
143
+ ---
144
+
145
+ **Note**: This model is specifically optimized for competitive programming and code reasoning tasks.