Gemma-2-2b Fine-tuned for Competitive Programming
This model is a fine-tuned version of google/gemma-2-2b-it on the open-r1/codeforces-cots dataset for competitive programming problem solving.
Model Details
Model Description
This model has been fine-tuned using LoRA (Low-Rank Adaptation) on competitive programming problems from Codeforces. It's designed to help generate solutions for algorithmic and data structure problems commonly found in competitive programming contests.
- Developed by: Aswith77
- Model type: Causal Language Model (Code Generation)
- Language(s): Python, C++, Java (primarily Python)
- License: MIT
- Finetuned from model: google/gemma-2-2b-it
- Fine-tuning method: LoRA (Low-Rank Adaptation)
Model Sources
- Repository: Aswith77/gemma-2-2b-it-finetune-codeforces-cots on Hugging Face
- Base Model: google/gemma-2-2b-it
- Dataset: open-r1/codeforces-cots
Uses
Direct Use
This model is intended for generating solutions to competitive programming problems, particularly those similar to Codeforces problems. It can:
- Generate algorithmic solutions for given problem statements
- Help with code completion for competitive programming
- Assist in learning algorithmic problem-solving patterns
Downstream Use
The model can be further fine-tuned on (a minimal continuation sketch follows this list):
- Specific programming languages
- Domain-specific algorithmic problems
- Educational coding platforms
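For continued fine-tuning, the published adapters can be loaded in trainable mode and trained further with the same PEFT/TRL stack used for this model. The sketch below assumes a recent TRL version and a hypothetical JSONL dataset with a `text` column; the dataset path, output directory, and hyperparameter values are placeholders, not part of this repository.

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM
from peft import PeftModel
from trl import SFTTrainer, SFTConfig

# Load the base model and attach the published LoRA adapters in trainable mode
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(
    base_model,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots",
    is_trainable=True,  # keep the adapter weights updatable
)

# Hypothetical dataset: one problem statement + solution per example in a "text" field
dataset = load_dataset("json", data_files="my_problems.jsonl", split="train")

# Continue supervised fine-tuning on the new data (values are illustrative)
trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="gemma-2-2b-codeforces-continued",
        dataset_text_field="text",
        max_steps=100,
        per_device_train_batch_size=1,
        gradient_accumulation_steps=2,
        learning_rate=2e-4,
    ),
)
trainer.train()
```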
Out-of-Scope Use
This model should not be used for:
- Production code without thorough testing
- Security-critical applications
- General-purpose software development without validation
- Problems requiring real-world system design
How to Get Started with the Model
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Generate code for a problem
problem = """
Given an array of integers, find the maximum sum of a contiguous subarray.
Input: [-2,1,-3,4,-1,2,1,-5,4]
Output: 6 (subarray [4,-1,2,1])
"""

inputs = tokenizer(problem, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_length=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solution)
```
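Because the adapters are loaded with PEFT, they can optionally be merged into the base weights once no further training is planned, which simplifies deployment and removes the PEFT dependency at inference time. This uses standard PEFT functionality; the output directory below is a placeholder.

```python
# Merge the LoRA adapters into the base model and drop the PEFT wrapper;
# the result behaves like a plain Gemma-2 checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("gemma-2-2b-codeforces-merged")
tokenizer.save_pretrained("gemma-2-2b-codeforces-merged")
```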
Training Details
Training Data
The model was trained on the open-r1/codeforces-cots dataset, specifically using 1,000 competitive programming problems and their solutions from Codeforces.
Training Procedure
Training Hyperparameters
- Training regime: fp16 mixed precision
- Learning rate: 2e-4
- Batch size: 1 (per device)
- Gradient accumulation steps: 2
- Max steps: 100
- Warmup steps: 5
- Optimizer: AdamW 8-bit
- Weight decay: 0.01
- LoRA rank (r): 16
- LoRA alpha: 32
- LoRA dropout: 0.1
- Target modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
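The original training script is not reproduced here; the sketch below reconstructs a configuration consistent with the hyperparameters listed above using standard PEFT, bitsandbytes, and Transformers APIs. The NF4 quantization type and output directory are assumptions; the card only states that 4-bit quantization was used.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit quantization during training (quant type NF4 is an assumption)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA configuration matching the values reported above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)

# Trainer arguments matching the reported hyperparameters
training_args = TrainingArguments(
    output_dir="gemma-2-2b-codeforces-lora",  # placeholder path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    max_steps=100,
    warmup_steps=5,
    learning_rate=2e-4,
    weight_decay=0.01,
    fp16=True,
    optim="adamw_bnb_8bit",  # 8-bit AdamW via bitsandbytes
)
```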
Speeds, Sizes, Times
- Training time: ~20 minutes
- Hardware: Tesla T4 GPU (16GB)
- Model size: ~30MB (LoRA adapters only)
- Final training loss: 0.3715
- Training samples per second: 0.338
Evaluation
No held-out evaluation results are reported; the figures below reflect training loss only. The loss decreased steadily from an initial 0.9303 to a final 0.3715 over the 100 training steps, indicating stable convergence without plateauing.
Training Loss Progression
- Initial loss: 0.9303
- Final loss: 0.3715
- Loss reduction: ~60%
Bias, Risks, and Limitations
Limitations
- Dataset bias: Trained primarily on Codeforces problems, may not generalize well to other competitive programming platforms
- Language bias: Solutions may favor certain programming patterns common in the training data
- Size limitations: Being a 2B parameter model, it may struggle with very complex algorithmic problems
- Code correctness: Generated code should always be tested and validated before use
Recommendations
- Always test generated solutions against multiple test cases (a minimal checking harness is sketched after this list)
- Use the model as a starting point, not a final solution
- Verify algorithmic correctness and time complexity
- Consider the model's suggestions as one approach among many possible solutions
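As a concrete version of the first recommendation, a small harness like the one below can run a generated solution against known test cases before it is trusted. The file name and test cases are hypothetical, and the harness assumes the generated program reads its input from stdin in the usual Codeforces style.

```python
import subprocess

def run_solution(source_path: str, stdin_text: str, timeout: float = 2.0) -> str:
    """Run a generated Python solution with the given stdin and return its stdout."""
    result = subprocess.run(
        ["python", source_path],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout.strip()

# Example test cases for the maximum-subarray problem shown earlier in this card
test_cases = [
    ("9\n-2 1 -3 4 -1 2 1 -5 4\n", "6"),
    ("1\n-5\n", "-5"),
]

for stdin_text, expected in test_cases:
    actual = run_solution("generated_solution.py", stdin_text)
    print("OK" if actual == expected else f"FAIL: expected {expected}, got {actual!r}")
```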
Environmental Impact
Training was conducted on a single Tesla T4 GPU for approximately 20 minutes, resulting in minimal environmental impact compared to larger scale training runs.
- Hardware Type: Tesla T4 GPU
- Hours used: ~0.33
- Cloud Provider: Kaggle
- Compute Region: Not specified
- Carbon Emitted: Minimal (estimated < 0.1 kg CO2eq)
Technical Specifications
Model Architecture and Objective
- Base Architecture: Gemma-2 (2B parameters)
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Objective: Causal language modeling with supervised fine-tuning
- Quantization: 4-bit quantization during training
Compute Infrastructure
Hardware
- GPU: Tesla T4 (16GB VRAM)
- Platform: Kaggle Notebooks
Software
- Framework: PyTorch, Transformers, PEFT, TRL
- Quantization: bitsandbytes 4-bit
- Training: Supervised Fine-Tuning (SFT)
Model Card Authors
Created by Aswith77 during fine-tuning experiments with competitive programming datasets.
Model Card Contact
For questions or issues regarding this model, please open an issue in the model repository.