---
license: apache-2.0
datasets:
- GetSoloTech/Code-Reasoning
base_model:
- Qwen/Qwen3-4B-Thinking-2507
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
- competitive-programming
- code-reasoning
- programming
- algorithms
- problem-solving
- python
---

# GetSoloTech/Qwen3-Code-Reasoning-4B

A finetuned version of [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) optimized for competitive programming and code-reasoning tasks. The model was trained on the high-quality [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning) dataset to improve its ability to solve complex programming problems with detailed reasoning.

## 🎯 Model Overview

This model is a **LoRA-finetuned** version of [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) with the following specifications:

- **Base Model**: Qwen3-4B-Thinking-2507 (4.0B parameters)
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Training Dataset**: GetSoloTech/Code-Reasoning
- **Training Framework**: Unsloth with QLoRA
- **Context Length**: 4,096 tokens during training (configurable up to the base model's 262,144)
- **Model Type**: Causal Language Model with Thinking Capabilities

## 🚀 Key Features

- **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
- **Thinking Capabilities**: Inherits the advanced reasoning capabilities from the base model
- **High-Quality Solutions**: Trained on solutions with ≥50% test case pass rates
- **Structured Output**: Optimized for generating well-reasoned programming solutions
- **Efficient Training**: Uses LoRA adapters for efficient parameter updates (see the sketch below)
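
To make the LoRA mention concrete, here is a minimal, illustrative sketch of how a LoRA adapter modifies a frozen linear layer. The rank `r` and scaling `alpha` are made-up example values, not this model's actual training configuration:

```python
import torch

# Illustrative only: LoRA adds a trainable low-rank update to a frozen weight.
# r and alpha are hypothetical example values, not this model's training config.
d, r, alpha = 1024, 16, 32

W = torch.randn(d, d)          # frozen pretrained weight (never updated)
A = torch.randn(r, d) * 0.01   # trainable low-rank factor, small init
B = torch.zeros(d, r)          # trainable low-rank factor, zero init

def lora_forward(x: torch.Tensor) -> torch.Tensor:
    # y = x @ W.T + (alpha / r) * x @ A.T @ B.T
    # Only A and B are trained: 2*d*r parameters instead of d*d.
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = torch.randn(2, d)
print(lora_forward(x).shape)   # torch.Size([2, 1024])
```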


## 📊 Dataset Statistics
- **Split**: Python
- **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
- **Quality Filter**: Only problems whose solutions pass at least 50% of the test cases (a sketch of this filter appears below)
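
As a rough illustration of that quality filter, the snippet below keeps only examples above the 50% pass-rate threshold. The split and column names (`python`, `pass_rate`) are assumptions for illustration, not the dataset's documented schema:

```python
from datasets import load_dataset

# Hypothetical sketch: split and column names are assumptions --
# check the dataset card for the actual schema before running.
ds = load_dataset("GetSoloTech/Code-Reasoning", split="python")

# Keep only examples whose solution passed at least half of the test cases.
kept = ds.filter(lambda ex: ex["pass_rate"] >= 0.5)
print(f"{len(kept)}/{len(ds)} examples pass the quality filter")
```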

## 🔧 Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GetSoloTech/Qwen3-Code-Reasoning-4B"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare input for competitive programming problem
messages = [
    {"role": "system", "content": "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful."},
    {"role": "user", "content": "Your programming problem here..."}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate solution
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    do_sample=True,   # enable sampling so temperature/top_p/top_k apply
    temperature=0.7,
    top_p=0.8,
    top_k=20
)

output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print(content)
```
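
The base Qwen3-Thinking model emits its chain of thought before the final answer, terminated by a `</think>` token (id 151668 in the Qwen3 tokenizer). Assuming this finetune preserves that behavior, the reasoning can be separated from the solution like this (continuing from the snippet above):

```python
# Assumes the finetune keeps the base model's </think> delimiter (token id 151668).
try:
    # Position just after the last </think> token; everything before is reasoning.
    idx = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    idx = 0  # no thinking block emitted

reasoning = tokenizer.decode(output_ids[:idx], skip_special_tokens=True).strip("\n")
solution = tokenizer.decode(output_ids[idx:], skip_special_tokens=True).strip("\n")
print("Reasoning:\n", reasoning)
print("Solution:\n", solution)
```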

## 📈 Performance Expectations

This finetuned model is expected to show improved performance on:

- **Competitive Programming Problems**: Better understanding of problem constraints and requirements
- **Code Generation**: More accurate and efficient solutions
- **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
- **Solution Completeness**: More comprehensive solutions with proper edge case handling

## 🎛️ Recommended Settings

### For Code Generation
- **Temperature**: 0.7
- **Top-p**: 0.8
- **Top-k**: 20
- **Max New Tokens**: 4096 (adjust based on problem complexity)

### For Reasoning Tasks
- **Temperature**: 0.6
- **Top-p**: 0.95
- **Top-k**: 20
- **Max New Tokens**: 81920 (for complex reasoning; both presets are captured in the sketch below)
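
For convenience, both presets can be expressed as `GenerationConfig` objects (a sketch; the values simply mirror the settings above):

```python
from transformers import GenerationConfig

# Presets mirroring the recommended settings above.
code_generation = GenerationConfig(
    do_sample=True, temperature=0.7, top_p=0.8, top_k=20, max_new_tokens=4096,
)
reasoning_tasks = GenerationConfig(
    do_sample=True, temperature=0.6, top_p=0.95, top_k=20, max_new_tokens=81920,
)

# Example:
# generated_ids = model.generate(**model_inputs, generation_config=code_generation)
```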

## 🔗 Related Resources

- **Base Model**: [Qwen3-4B-Thinking-2507](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
- **Training Dataset**: [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth)
- **Original Dataset**: [OpenCodeReasoning-2](https://huggingface.co/datasets/nvidia/OpenCodeReasoning-2)

## 🤝 Contributing

This model was created using the Unsloth framework and the Code-Reasoning dataset. For questions about:
- The base model: [Qwen3 GitHub](https://github.com/QwenLM/Qwen3)
- The training dataset: [Code-Reasoning Repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- The training framework: [Unsloth Documentation](https://docs.unsloth.ai/)

## 📄 License

This model follows the same license as the base model (Apache 2.0). Please refer to the [base model license](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507/blob/main/LICENSE) for details.

## 🙏 Acknowledgments

- **Qwen Team** for the excellent base model
- **Unsloth Team** for the efficient training framework
- **NVIDIA Research** for the original OpenCodeReasoning-2 dataset

## 📞 Contact

For questions about this finetuned model, please open an issue in the repository.

---

**Note**: This model is specifically optimized for competitive programming and code reasoning tasks.