license: gemma
language:
  - sl
  - en
base_model:
  - cjvt/GaMS-9B-Instruct
datasets:
  - nvidia/Nemotron-Post-Training-Dataset-v1
pipeline_tag: text-generation

Model Card for GaMS-9B-Instruct-Nemotron

GaMS-9B-Instruct-Nemotron is a variant of GaMS-9B-Instruct, further trained with supervised fine-tuning (SFT) on a curated subset of the chat part of nvidia/Nemotron-Post-Training-Dataset-v1. The training data included ~80k Slovenian instruction–response pairs (based on translations with additional modifications to adjust identity and context) and ~20k English examples.

This instruction-tuned version of GaMS 9B was developed as part of a master's thesis project by Timotej Petrič. Further details will follow.


Acknowledgment

The model was developed within the PoVeJMo research program (Adaptive Natural Language Processing with Large Language Models), particularly within the research project titled SloLLaMai -- Open-access computationally efficient models for Slovenian. The program is funded within the Recovery and Resilience Plan by the Slovenian Research and Innovation Agency (ARIS) and NextGenerationEU. The authors also acknowledge the financial support from the Slovenian Research and Innovation Agency (research core funding No. P6-0411 -- Language Resources and Technologies for Slovene).

Usage

The model can be run through the transformers pipeline API using the following code:

from transformers import pipeline

model_id = "GaMS-Beta/GaMS-9B-Instruct-Nemotron"

pline = pipeline(
    "text-generation",
    model=model_id,
    device_map="cuda" # replace with "mps" to run on a Mac device
)

# Example of response generation
message = [{"role": "user", "content": "Pozdravljen. Kdo si?"}]
response = pline(message, max_new_tokens=512)
# For chat input the pipeline returns the whole conversation; the last entry is the model's reply
print("Model response:", response[0]["generated_text"][-1]["content"])
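
For a multi-turn conversation, one possible approach is to feed the conversation returned by the pipeline back in with a new user turn appended. The sketch below is illustrative only: the helper names build_messages, extract_reply and chat_demo are hypothetical and not part of the model or of transformers, and the follow-up prompt is an invented example. It assumes the chat-style output format used in the snippet above, where generated_text holds the list of messages.

```python
from typing import Dict, List, Optional

Message = Dict[str, str]


def build_messages(user_text: str, history: Optional[List[Message]] = None) -> List[Message]:
    """Return a new message list: prior conversation (if any) plus one user turn."""
    messages = list(history) if history else []  # copy so the caller's list is untouched
    messages.append({"role": "user", "content": user_text})
    return messages


def extract_reply(pipeline_output: list) -> str:
    """Return the assistant text from a chat-style text-generation result."""
    # generated_text is the full conversation; the last entry is the new reply
    return pipeline_output[0]["generated_text"][-1]["content"]


def chat_demo() -> None:
    """Two-turn example; loads the model, so it needs suitable hardware."""
    from transformers import pipeline  # imported here so the helpers work without the model

    pline = pipeline(
        "text-generation",
        model="GaMS-Beta/GaMS-9B-Instruct-Nemotron",
        device_map="cuda",  # assumption: a CUDA GPU is available
    )
    first = pline(build_messages("Pozdravljen. Kdo si?"), max_new_tokens=512)
    print("Turn 1:", extract_reply(first))

    # Reuse the returned conversation as history for the follow-up question
    follow_up = build_messages("Odgovori še v angleščini.", first[0]["generated_text"])
    second = pline(follow_up, max_new_tokens=512)
    print("Turn 2:", extract_reply(second))
```

Calling chat_demo() runs the two turns end to end. The two helpers are plain Python and can be reused with any pipeline output of the same shape.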