---
datasets:
- GetSoloTech/Code-Reasoning
base_model:
- google/gemma-3-4b-it
pipeline_tag: text-generation
library_name: transformers
tags:
- code-generation
- competitive-programming
- code-reasoning
- programming
- algorithms
- problem-solving
---

# GetSoloTech/Gemma3-Code-Reasoning-4B

A finetuned version of [google/gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) optimized for competitive programming and code reasoning. The model was trained on the high-quality [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning) dataset to improve its ability to solve complex programming problems with detailed, step-by-step reasoning.

## 🎯 Model Overview

This model is a **LoRA-finetuned** version of [gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it) with the following specifications:

- **Base Model**: gemma-3-4b-it (4.0B parameters)
- **Training Method**: LoRA (Low-Rank Adaptation)
- **Training Dataset**: GetSoloTech/Code-Reasoning
- **Training Framework**: Unsloth with QLoRA
- **Context Length**: 4096 tokens
- **Model Type**: Causal Language Model with Thinking Capabilities

## 🚀 Key Features

- **Enhanced Code Reasoning**: Specifically trained on competitive programming problems
- **Thinking Capabilities**: Inherits the advanced reasoning capabilities of the base model
- **High-Quality Solutions**: Trained on solutions with ≥50% test case pass rates
- **Structured Output**: Optimized for generating well-reasoned programming solutions
- **Efficient Training**: Uses LoRA adapters for efficient parameter updates

### Dataset Statistics

- **Split**: Python
- **Source**: High-quality competitive programming problems from TACO, APPS, CodeContests, and Codeforces
- **Quality Filter**: Only correctly solved problems with ≥85% test case pass rates

## 🔧 Usage

### Basic Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "GetSoloTech/Gemma3-Code-Reasoning-4B"

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

# Prepare input for a competitive programming problem
messages = [
    {"role": "system", "content": "You are an expert competitive programmer. Read the problem and produce a correct, efficient solution. Include reasoning if helpful."},
    {"role": "user", "content": "Your programming problem here..."},
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate a solution; do_sample=True is required for the
# temperature/top_p/top_k settings below to take effect
generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096,
    do_sample=True,
    temperature=1.0,
    top_p=0.95,
    top_k=64,
)

# Decode only the newly generated tokens
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True).strip("\n")
print(content)
```
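
The model's response typically interleaves reasoning with fenced code. A small helper like the sketch below (`extract_code_blocks` is an illustrative utility, not part of this repository or the transformers API) can pull just the code out of the decoded text:

```python
import re

FENCE = "`" * 3  # a markdown code-fence delimiter, built programmatically

def extract_code_blocks(text: str, lang: str = "python") -> list[str]:
    """Return the contents of all fenced code blocks for `lang` in a model response."""
    pattern = re.escape(FENCE + lang) + r"\n(.*?)" + re.escape(FENCE)
    return [block.strip() for block in re.findall(pattern, text, re.DOTALL)]

# Example: extract the solution code from a typical response
response = f"Here is my solution:\n{FENCE}python\nprint('hello')\n{FENCE}\nDone."
print(extract_code_blocks(response))  # → ["print('hello')"]
```

This keeps any surrounding reasoning text available for inspection while giving you a clean program to run against test cases.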

## 📈 Performance Expectations

This finetuned model is expected to show improved performance on:

- **Competitive Programming Problems**: Better understanding of problem constraints and requirements
- **Code Generation**: More accurate and efficient solutions
- **Reasoning Quality**: Enhanced step-by-step reasoning for complex problems
- **Solution Completeness**: More comprehensive solutions with proper edge case handling

## 🎛️ Recommended Settings

- **Temperature**: 1.0
- **Top-p**: 0.95
- **Top-k**: 64
- **Max New Tokens**: 4096 (adjust based on problem complexity)
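
For convenience, the settings above can be collected into a single kwargs dict and unpacked into `model.generate` (`GEN_KWARGS` is an illustrative name, not part of the transformers API):

```python
# Recommended sampling settings from this card, collected for reuse as
# model.generate(**model_inputs, **GEN_KWARGS).
GEN_KWARGS = {
    "max_new_tokens": 4096,  # adjust based on problem complexity
    "do_sample": True,       # sampling must be enabled for the settings below to apply
    "temperature": 1.0,
    "top_p": 0.95,
    "top_k": 64,
}
```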

## 🔗 Related Resources

- **Base Model**: [gemma-3-4b-it](https://huggingface.co/google/gemma-3-4b-it)
- **Training Dataset**: [Code-Reasoning](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- **Training Framework**: [Unsloth](https://github.com/unslothai/unsloth)
- **Original Dataset**: [OpenCodeReasoning-2](https://huggingface.co/datasets/nvidia/OpenCodeReasoning-2)

## 🤝 Contributing

This model was created using the Unsloth framework and the Code-Reasoning dataset. For questions about:

- The base model: [Gemma3 Hugging Face page](https://huggingface.co/google/gemma-3-4b-it)
- The training dataset: [Code-Reasoning repository](https://huggingface.co/datasets/GetSoloTech/Code-Reasoning)
- The training framework: [Unsloth documentation](https://docs.unsloth.ai/)

## 🙏 Acknowledgments

- **Gemma Team** for the excellent base model
- **Unsloth Team** for the efficient training framework
- **NVIDIA Research** for the original OpenCodeReasoning-2 dataset

## 📞 Contact

For questions about this finetuned model, please open an issue in the repository.

---

**Note**: This model is specifically optimized for competitive programming and code reasoning tasks.