Gemma-3 270M Fine-tuned (XSum)

This is a fine-tuned version of google/gemma-3-270m, trained on the XSum dataset for abstractive summarization.
Training was done with Unsloth for efficient fine-tuning, and the LoRA adapters have been merged into the final model weights.


Model Details

  • Base model: google/gemma-3-270m
  • Architecture: Gemma-3, 270M parameters
  • Training framework: Unsloth
  • Task: Abstractive summarization
  • Dataset: XSum
  • Adapter merge: Yes (LoRA weights merged into final model)
  • Precision: Full precision (no 4-bit/8-bit quantization)
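
As a quick sanity check of the last two points (merged adapters, full precision), the checkpoint can be loaded as a plain transformers model and inspected. This is a minimal sketch, using the repository name from the Example Usage section below:

from transformers import AutoModelForCausalLM

# Load the merged checkpoint directly; no separate PEFT/adapter files are needed.
model = AutoModelForCausalLM.from_pretrained("ShahzebKhoso/Gemma3_270M_FineTuned_XSUM")

# Expect roughly 268M parameters stored as float32.
print(sum(p.numel() for p in model.parameters()))
print(next(model.parameters()).dtype)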

Training Configuration

The model was fine-tuned from unsloth/gemma-3-270m-it using LoRA adapters with the Unsloth framework; the adapters were later merged into the base model weights.
A loading call consistent with these settings is sketched after the list below.

  • Base model: unsloth/gemma-3-270m-it
  • Sequence length: 2048
  • Quantization: none (no 4-bit or 8-bit)
  • Full fine-tuning: disabled (LoRA fine-tuning only)
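
The original training script is not part of this card, so the sketch below is illustrative rather than a copy of the author's code; in particular, the full_finetuning argument is assumed from recent Unsloth releases.

from unsloth import FastLanguageModel

# Illustrative sketch: load the instruction-tuned base with a 2048-token
# context and without 4-bit/8-bit quantization.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/gemma-3-270m-it",
    max_seq_length=2048,
    load_in_4bit=False,
    full_finetuning=False,  # LoRA fine-tuning rather than full fine-tuning
)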

LoRA Setup

  • Rank (r): 128
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LoRA alpha: 128
  • LoRA dropout: 0
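
Mapped onto Unsloth's API, this corresponds roughly to the following get_peft_model call (a sketch; arguments not listed above, such as bias, are assumed to be left at their defaults):

# Attach LoRA adapters with the configuration listed above (sketch).
model = FastLanguageModel.get_peft_model(
    model,
    r=128,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=128,
    lora_dropout=0,
)

After training, the adapters were merged back into the base weights; Unsloth's save_pretrained_merged (or PEFT's merge_and_unload) is the usual route, though the card does not state which was used here.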

Training Details

  • Dataset: XSum
  • Batch size per device: 128
  • Gradient accumulation steps: 1
  • Warmup steps: 5
  • Training epochs: 1
  • Learning rate: 5e-5 (linear schedule)
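
Put together, a trainer setup consistent with these hyperparameters, using TRL's SFTTrainer as in typical Unsloth recipes, might look like the sketch below. The prompt template and the exact XSum preprocessing are not documented in this card, so the formatting step is a placeholder assumption.

from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Build a "text" column from XSum's "document" and "summary" fields.
# The actual prompt template used for training is not stated, so this one is hypothetical.
def to_text(example):
    return {"text": f"Summarize: {example['document']}\nSummary: {example['summary']}"}

# Depending on your datasets version, loading XSum may require trust_remote_code=True.
train_dataset = load_dataset("EdinburghNLP/xsum", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,          # the LoRA-wrapped model from the sketches above
    tokenizer=tokenizer,
    train_dataset=train_dataset,
    args=SFTConfig(
        per_device_train_batch_size=128,
        gradient_accumulation_steps=1,
        warmup_steps=5,
        num_train_epochs=1,
        learning_rate=5e-5,
        lr_scheduler_type="linear",
        output_dir="outputs",
    ),
)
trainer.train()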

Intended Use

  • Primary use case: Abstractive summarization of long-form text (news-style)
  • Not suitable for: Factual Q&A, reasoning, coding, or tasks requiring large-context models
  • Limitations: Small model size (270M) means limited reasoning ability compared to larger Gemma models

Example Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the merged, full-precision checkpoint from the Hub
model_name = "ShahzebKhoso/Gemma3_270M_FineTuned_XSUM"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Text to summarize
text = "The UK government announced new measures to support renewable energy."
inputs = tokenizer(text, return_tensors="pt")

# Generate a summary (up to 100 new tokens)
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
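
Note that generate() returns the prompt tokens followed by the newly generated ones, so the decoded string above repeats the input text. To print only the generated summary, slice off the prompt first:

# Keep only the newly generated tokens (drop the echoed prompt).
prompt_len = inputs["input_ids"].shape[-1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))

Because the model was fine-tuned from the instruction-tuned checkpoint, wrapping the article with tokenizer.apply_chat_template may also be worth trying; the prompt format used during training is not documented in this card.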

License

As a derivative of google/gemma-3-270m, this model is subject to the Gemma Terms of Use; see the base model card for the full terms.

Acknowledgements

Fine-tuning was done with the Unsloth framework; thanks to the Unsloth team and to Google for releasing the Gemma-3 base models.
