Gemma-2-2b Fine-tuned for Competitive Programming

This model is a fine-tuned version of google/gemma-2-2b-it on the open-r1/codeforces-cots dataset for competitive programming problem solving.

Model Details

Model Description

This model has been fine-tuned using LoRA (Low-Rank Adaptation) on competitive programming problems from Codeforces. It's designed to help generate solutions for algorithmic and data structure problems commonly found in competitive programming contests.

  • Developed by: Aswith77
  • Model type: Causal Language Model (Code Generation)
  • Language(s): Python, C++, Java (primarily Python)
  • License: MIT
  • Finetuned from model: google/gemma-2-2b-it
  • Fine-tuning method: LoRA (Low-Rank Adaptation)
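Because only the LoRA adapters are published (~30MB, see below), they can optionally be merged into the base weights to produce a standalone checkpoint. A minimal sketch; the output directory name is illustrative:

from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model and attach the published LoRA adapters
base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(base, "Aswith77/gemma-2-2b-it-finetune-codeforces-cots")

# Fold the adapter weights into the base weights and save a standalone model
merged = model.merge_and_unload()
merged.save_pretrained("gemma-2-2b-codeforces-merged")  # illustrative path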

Uses

Direct Use

This model is intended for generating solutions to competitive programming problems, particularly those similar to Codeforces problems. It can:

  • Generate algorithmic solutions for given problem statements
  • Help with code completion for competitive programming
  • Assist in learning algorithmic problem-solving patterns

Downstream Use

The model can be further fine-tuned on:

  • Specific programming languages
  • Domain-specific algorithmic problems
  • Educational coding platforms
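For such downstream use, the published adapters can be loaded as trainable so that fine-tuning continues from this checkpoint rather than from the base model. A minimal sketch, with the trainer and dataset wiring omitted:

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("google/gemma-2-2b-it")
model = PeftModel.from_pretrained(
    base,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots",
    is_trainable=True,  # keep the LoRA weights trainable for further fine-tuning
)
# `model` can then be passed to a trainer (e.g. TRL's SFTTrainer) together
# with the new dataset.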

Out-of-Scope Use

This model should not be used for:

  • Production code without thorough testing
  • Security-critical applications
  • General-purpose software development without validation
  • Problems requiring real-world system design

How to Get Started with the Model

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load the fine-tuned LoRA adapters
model = PeftModel.from_pretrained(
    base_model,
    "Aswith77/gemma-2-2b-it-finetune-codeforces-cots"
)

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")

# Generate code for a problem
problem = """
Given an array of integers, find the maximum sum of a contiguous subarray.
Input: [-2,1,-3,4,-1,2,1,-5,4]
Output: 6 (subarray [4,-1,2,1])
"""

inputs = tokenizer(problem, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,  # budget for the generated solution, excluding the prompt
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

solution = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(solution)
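Since the base model is instruction-tuned, wrapping the problem statement in the Gemma chat template may give better completions than a raw prompt. A sketch; whether the fine-tuning data used this exact template is an assumption:

messages = [{"role": "user", "content": problem}]
chat_inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

chat_outputs = model.generate(
    chat_inputs,
    max_new_tokens=512,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(chat_outputs[0], skip_special_tokens=True))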

Training Details

Training Data

The model was trained on the open-r1/codeforces-cots dataset, specifically using 1,000 competitive programming problems and their solutions from Codeforces.
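For reference, the dataset can be pulled from the Hugging Face Hub with the datasets library. This is a sketch; the subset name and the way the 1,000 examples were selected are assumptions:

from datasets import load_dataset

# "solutions" is an assumed subset name; adjust to the subset actually used.
dataset = load_dataset("open-r1/codeforces-cots", "solutions", split="train")
train_subset = dataset.shuffle(seed=42).select(range(1000))
print(train_subset)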

Training Procedure

Training Hyperparameters

  • Training regime: fp16 mixed precision
  • Learning rate: 2e-4
  • Batch size: 1 (per device)
  • Gradient accumulation steps: 2
  • Max steps: 100
  • Warmup steps: 5
  • Optimizer: AdamW 8-bit
  • Weight decay: 0.01
  • LoRA rank (r): 16
  • LoRA alpha: 32
  • LoRA dropout: 0.1
  • Target modules: q_proj, v_proj, k_proj, o_proj, gate_proj, up_proj, down_proj
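The hyperparameters above correspond roughly to the following PEFT/TRL configuration. This is a reconstruction, not the original training script, and assumes TRL's SFTConfig was used (TRL and SFT are listed under Software below):

from peft import LoraConfig
from trl import SFTConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)

training_args = SFTConfig(
    output_dir="gemma-2-2b-codeforces-lora",  # illustrative
    learning_rate=2e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=2,
    max_steps=100,
    warmup_steps=5,
    optim="adamw_bnb_8bit",  # 8-bit AdamW via bitsandbytes
    weight_decay=0.01,
    fp16=True,
)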

Speeds, Sizes, Times

  • Training time: ~20 minutes
  • Hardware: Tesla T4 GPU (16GB)
  • Model size: ~30MB (LoRA adapters only)
  • Final training loss: 0.3715
  • Training samples per second: 0.338

Evaluation

No held-out benchmark results are reported; evaluation here is based on the training loss, which declined steadily from an initial 0.9303 to a final 0.3715 over the 100 training steps.

Training Loss Progression

  • Initial loss: 0.9303
  • Final loss: 0.3715
  • Loss reduction: ~60%

Bias, Risks, and Limitations

Limitations

  • Dataset bias: Trained primarily on Codeforces problems, so it may not generalize well to other competitive programming platforms
  • Language bias: Solutions may favor certain programming patterns common in the training data
  • Size limitations: Being a 2B parameter model, it may struggle with very complex algorithmic problems
  • Code correctness: Generated code should always be tested and validated before use

Recommendations

  • Always test generated solutions with multiple test cases
  • Use the model as a starting point, not a final solution
  • Verify algorithmic correctness and time complexity
  • Consider the model's suggestions as one approach among many possible solutions
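As a concrete example of the first recommendation, a minimal local harness for checking a generated Python solution against sample cases might look like the following (the helper and file names are hypothetical):

import subprocess

def check_solution(source_path, cases):
    """Run a generated Python solution against (stdin, expected stdout) pairs."""
    for stdin_text, expected in cases:
        result = subprocess.run(
            ["python", source_path],
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=5,  # guard against infinite loops in generated code
        )
        if result.stdout.strip() != expected.strip():
            return False
    return True

# Example: check a generated solution file against one Codeforces-style sample case.
print(check_solution("solution.py", [("5\n1 2 3 4 5\n", "15\n")]))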

Environmental Impact

Training was conducted on a single Tesla T4 GPU for approximately 20 minutes, resulting in minimal environmental impact compared to larger scale training runs.

  • Hardware Type: Tesla T4 GPU
  • Hours used: 0.33 hours
  • Cloud Provider: Kaggle
  • Compute Region: Not specified
  • Carbon Emitted: Minimal (estimated < 0.1 kg CO2eq)

Technical Specifications

Model Architecture and Objective

  • Base Architecture: Gemma-2 (2B parameters)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Objective: Causal language modeling with supervised fine-tuning
  • Quantization: 4-bit quantization of the base model during training (see the loading sketch below)
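Loading the base model in 4-bit with bitsandbytes typically looks like the following. This is a sketch of a standard QLoRA-style setup; the exact quantization settings used for this run are an assumption:

from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # assumed quantization type
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,        # assumed
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it",
    quantization_config=bnb_config,
    device_map="auto",
)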

Compute Infrastructure

Hardware

  • GPU: Tesla T4 (16GB VRAM)
  • Platform: Kaggle Notebooks

Software

  • Framework: PyTorch, Transformers, PEFT, TRL
  • Quantization: bitsandbytes 4-bit
  • Training: Supervised Fine-Tuning (SFT)

Model Card Authors

Created by Aswith77 during fine-tuning experiments with competitive programming datasets.

Model Card Contact

For questions or issues regarding this model, please open an issue in the model repository.
