Gemma-3 270M Teacher Evaluation Model
Model Description
This model is a fine-tuned version of unsloth/gemma-3-270m-it, trained to grade student texts the way a teacher would. Given an educational text, it assigns a numerical evaluation on a scale of 0-5.
Training Details
- Base Model: Gemma-3 270M Instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training Dataset: Custom educational texts with teacher evaluations
- Training Samples: 3,646 training + 912 validation samples
- Epochs: 3
- Training Time: ~15 minutes
- Final Training Loss: 0.3782
- Final Validation Loss: 0.4566
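The sample counts above correspond to roughly an 80/20 train/validation split, as this quick check shows:

```python
# Verify the train/validation proportions implied by the sample counts above.
train_samples, val_samples = 3646, 912
val_fraction = val_samples / (train_samples + val_samples)
print(f"validation fraction: {val_fraction:.0%}")  # ≈ 20%
```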
Model Performance
The model shows good generalization with minimal overfitting:
- Training loss decreased from 0.57 to 0.38 (a 33% reduction)
- Validation loss remained stable around 0.45-0.46
- Small gap between training and validation loss indicates good generalization
Usage
```python
from unsloth import FastModel

# Load the model (FastModel returns both the model and its tokenizer)
model, tokenizer = FastModel.from_pretrained("carlosvillu/gemma3-270m-teacher-evaluation")

# Example prompt (in Spanish): the system message instructs the model to act as
# a teacher and grade the student's answer on a 0-5 scale.
prompt = '''<|begin_of_text|><|start_header_id|>system<|end_header_id|>
Eres una profesora de primera, tu tarea es la de evaluar los textos que está escribiendo tus alumnas como respuesta a una pregunta que le has formulado.
La evaluación, tiene que ser numérica y solo puedes ponerle una nota de 0 a 5.
<|eot_id|><|start_header_id|>user<|end_header_id|>
Un alumno del curso 2º de educación secundaria de cataluña ha respondido a la consigna `Describe tu ciudad` con:
Mi ciudad es muy bonita. Tiene muchas casas y parques. Me gusta vivir aquí porque hay tiendas y restaurantes.
<|eot_id|><|start_header_id|>assistant<|end_header_id|>'''

# Generate the evaluation (do_sample=True is required for temperature to apply)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
Training Configuration
- LoRA rank: 128
- Learning rate: 2e-4
- Batch size: 8 (effective)
- Scheduler: Cosine with warmup
- Weight decay: 0.01
- Gradient clipping: 1.0
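For reference, the hyperparameters above could be expressed in an Unsloth + TRL training setup roughly like the following. This is a reconstruction, not the original training script: the dataset variables are placeholders, the per-device batch size / accumulation split is one way to reach an effective batch of 8, and the warmup ratio is an assumption (the card only states "cosine with warmup"):

```python
from unsloth import FastModel
from trl import SFTTrainer, SFTConfig

model, tokenizer = FastModel.from_pretrained("unsloth/gemma-3-270m-it")
model = FastModel.get_peft_model(model, r=128)  # LoRA rank 128

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_dataset,        # placeholder: 3,646 samples
    eval_dataset=eval_dataset,          # placeholder: 912 samples
    args=SFTConfig(
        num_train_epochs=3,
        learning_rate=2e-4,
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,  # effective batch size 8
        lr_scheduler_type="cosine",
        warmup_ratio=0.05,              # assumption: warmup fraction not stated
        weight_decay=0.01,
        max_grad_norm=1.0,              # gradient clipping
    ),
)
```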
Evaluation Scale
The model evaluates texts on a scale of 0-5:
- 0: Substantially below expected quality for educational level
- 1: Slightly below expected quality
- 2: Barely sufficient quality
- 3: Expected quality for educational level
- 4: Slightly above expected quality
- 5: Substantially above expected quality
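The rubric above can be kept alongside the parsed grade, for example with a hypothetical lookup helper like this:

```python
# Map each grade on the 0-5 scale to its rubric description from the model card.
SCALE = {
    0: "Substantially below expected quality for educational level",
    1: "Slightly below expected quality",
    2: "Barely sufficient quality",
    3: "Expected quality for educational level",
    4: "Slightly above expected quality",
    5: "Substantially above expected quality",
}

def describe(score: int) -> str:
    """Return the rubric description for a grade, rejecting out-of-range values."""
    if score not in SCALE:
        raise ValueError("Grade must be an integer from 0 to 5")
    return SCALE[score]
```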
Created by
Fine-tuned by carlosvillu using Unsloth for efficient training.
Training date: 2025-08-17