Gemma-3 270M Fine-tuned (XSum)

This is a fine-tuned version of google/gemma-3-270m, trained on the XSum dataset for abstractive summarization.
Training was done with Unsloth for efficient fine-tuning, and the LoRA adapters have been merged into the final model weights.


Model Details

  • Base model: google/gemma-3-270m
  • Architecture: Gemma-3, 270M parameters
  • Training framework: Unsloth
  • Task: Abstractive summarization
  • Dataset: XSum
  • Adapter merge: Yes (LoRA weights merged into final model)
  • Precision: Full precision (no 4-bit/8-bit quantization)
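
As a quick sanity check of the last two points (merged adapters, full precision), the checkpoint can be loaded as a plain transformers model and inspected. This is a minimal sketch, using the repository name from the Example Usage section below:

from transformers import AutoModelForCausalLM

# Load the merged checkpoint directly; no separate PEFT/adapter files are needed.
model = AutoModelForCausalLM.from_pretrained("ShahzebKhoso/Gemma3_270M_FineTuned_XSUM")

# Expect roughly 268M parameters stored as float32.
print(sum(p.numel() for p in model.parameters()))
print(next(model.parameters()).dtype)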

Training Configuration

The model was fine-tuned from unsloth/gemma-3-270m-it using LoRA adapters with the Unsloth framework; the adapters were later merged into the base model weights.
A loading call consistent with these settings is sketched after the list below.

  • Base model: unsloth/gemma-3-270m-it
  • Sequence length: 2048
  • Quantization: none (no 4-bit or 8-bit)
  • Full fine-tuning: disabled (LoRA fine-tuning only)
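
The original training script is not part of this card, so the sketch below is illustrative rather than a copy of the author's code; in particular, the full_finetuning argument is assumed from recent Unsloth releases.

from unsloth import FastLanguageModel

# Illustrative sketch: load the instruction-tuned base with a 2048-token
# context and without 4-bit/8-bit quantization.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,
    load_in_4bit=False,
    full_finetuning=False,  # LoRA fine-tuning rather than full fine-tuning
)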

LoRA Setup

  • Rank (r): 128
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LoRA alpha: 128
  • LoRA dropout: 0
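
Mapped onto Unsloth's API, this corresponds roughly to the following get_peft_model call (a sketch; arguments not listed above, such as bias, are assumed to be left at their defaults):

# Attach LoRA adapters with the configuration listed above (sketch).
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=128,
    lora_dropout=0,
)

After training, the adapters were merged back into the base weights; Unsloth's save_pretrained_merged (or PEFT's merge_and_unload) is the usual route, though the card does not state which was used here.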

Training Details

  • Dataset: XSum
  • Batch size per device: 128
  • Gradient accumulation steps: 1
  • Warmup steps: 5
  • Training epochs: 1
  • Learning rate: 5e-5 (linear schedule)
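
Put together, a trainer setup consistent with these hyperparameters, using TRL's SFTTrainer as in typical Unsloth recipes, might look like the sketch below. The prompt template and the exact XSum preprocessing are not documented in this card, so the formatting step is a placeholder assumption.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Build a "text" column from XSum's "document" and "summary" fields.
# The actual prompt template used for training is not stated, so this one is hypothetical.
def to_text(example):
    return {"text": f"Summarize: {example['document']}\nSummary: {example['summary']}"}

# Depending on your datasets version, loading XSum may require trust_remote_code=True.
train_dataset = load_dataset("EdinburghNLP/xsum", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,          # the LoRA-wrapped model from the sketches above
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    args=SFTConfig(
        per_device_train_batch_size=128,
        gradient_accumulation_steps=1,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=5e-5,
        lr_scheduler_type="linear",
        output_dir="outputs",
    ),
)
trainer.train()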

Intended Use

  • Primary use case: Abstractive summarization of long-form text (news-style)
  • Not suitable for: Factual Q&A, reasoning, coding, or tasks requiring large-context models
  • Limitations: Small model size (270M) means limited reasoning ability compared to larger Gemma models

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged, full-precision checkpoint from the Hub
model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Text to summarize
text = "The UK government announced new measures to support renewable energy."
inputs = tokenizer(text, return_tensors="pt")

# Generate a summary (up to 100 new tokens)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
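
Note that generate() returns the prompt tokens followed by the newly generated ones, so the decoded string above repeats the input text. To print only the generated summary, slice off the prompt first:

# Keep only the newly generated tokens (drop the echoed prompt).
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))

Because the model was fine-tuned from the instruction-tuned checkpoint, wrapping the article with tokenizer.apply_chat_template may also be worth trying; the prompt format used during training is not documented in this card.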

License

As a derivative of google/gemma-3-270m, this model is subject to the Gemma Terms of Use; see the base model card for the full terms.

Acknowledgements

Fine-tuning was done with the Unsloth framework; thanks to the Unsloth team and to Google for releasing the Gemma-3 base models.
