# Model Card for outputs_sft
**outputs_sft** is a Supervised Fine-Tuning (SFT) LoRA adapter trained on top of [Qwen/Qwen3-4B](https://huggingface.co/Qwen/Qwen3-4B). It was trained with [TRL](https://github.com/huggingface/trl) and [PEFT](https://github.com/huggingface/peft).
## Quick start
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from peft import AutoPeftModelForCausalLM

REPO_ID = "outputs_sft"  # Replace with your Hub repo if different

# Load the base model and apply the LoRA adapter (recommended for inference)
model = AutoPeftModelForCausalLM.from_pretrained(REPO_ID, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(REPO_ID, use_fast=True)

# The model is already placed on devices, so no device_map is passed here
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)

prompt = "If you had a time machine and could go only once, where and when would you go? Explain your reasoning."
out = pipe(prompt, max_new_tokens=256, do_sample=True, top_p=0.9, temperature=0.7)[0]["generated_text"]
print(out)

# Alternatively, if you already merged the LoRA and saved the full model weights:
# model = AutoModelForCausalLM.from_pretrained(REPO_ID, device_map="auto")
# tokenizer = AutoTokenizer.from_pretrained(REPO_ID, use_fast=True)
```
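If you want standalone full weights (so the adapter no longer needs to be applied at load time), a minimal sketch is to merge the adapter into the base model and save the result. `"outputs_sft_merged"` below is a hypothetical output path:

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# A minimal sketch, assuming the adapter repo above. merge_and_unload() folds
# the LoRA weights into the base model so the result can later be loaded with
# plain AutoModelForCausalLM. "outputs_sft_merged" is a hypothetical path.
model = AutoPeftModelForCausalLM.from_pretrained("outputs_sft")
merged = model.merge_and_unload()
merged.save_pretrained("outputs_sft_merged")

tokenizer = AutoTokenizer.from_pretrained("outputs_sft")
tokenizer.save_pretrained("outputs_sft_merged")
```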
## Intended uses & limitations
### Intended uses
- General instruction following and helpful assistant style responses.
- Short-form reasoning and everyday Q&A.
- Creative writing, drafting, and rewriting.
### Limitations
- Not evaluated for safety-critical or high-stakes domains.
- May produce inaccurate, biased, or undesired content.
- Long-chain reasoning may require specialized training.
### Bias, risks, and limitations

Outputs may reflect biases present in the training data. Review generations before deploying them in production.
## Training data
- The training dataset could not be auto-detected from the notebook. Please document your data sources here.
## Training procedure
This model was trained with supervised fine-tuning (SFT) using TRL together with a PEFT (LoRA) configuration.
### PEFT / LoRA config
- lora_dropout: 0.05
- target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
### Precision & quantization
- load_in_4bit: True
- bnb_4bit_compute_dtype: float32
- dtype: float32
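These settings correspond to loading the base model in 4-bit via bitsandbytes, roughly as in this sketch. Only `load_in_4bit` and the compute dtype were recorded; other `BitsAndBytesConfig` fields are left at their defaults:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# A hedged sketch of the 4-bit load described above; quantization type and
# double-quant flags were not recorded, so defaults are used.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float32,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B",
    quantization_config=bnb_config,
    torch_dtype=torch.float32,
    device_map="auto",
)
```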
### Key hyperparameters
- num_train_epochs: 2
- per_device_train_batch_size: 10
- per_device_eval_batch_size: 10
- gradient_accumulation_steps: 2
- learning_rate: 9e-4
- lr_scheduler_type: cosine
- logging_steps: 2
- save_steps: 8
- save_strategy: steps
- bf16: True
- fp16: False
- seed: 42
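Putting the pieces together, a hedged reconstruction of the run with TRL might look like the following. It reuses `base_model` and `lora_config` from the sketches above, and the dataset name is a placeholder since the actual data source was not recorded:

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder dataset: the training data was not captured in the notebook.
train_dataset = load_dataset("your-username/your-sft-dataset", split="train")

training_args = SFTConfig(
    output_dir="outputs_sft",
    num_train_epochs=2,
    per_device_train_batch_size=10,
    per_device_eval_batch_size=10,
    gradient_accumulation_steps=2,
    learning_rate=9e-4,
    lr_scheduler_type="cosine",
    logging_steps=2,
    save_steps=8,
    save_strategy="steps",
    bf16=True,
    fp16=False,
    seed=42,
)

trainer = SFTTrainer(
    model=base_model,             # 4-bit base model from the sketch above
    args=training_args,
    train_dataset=train_dataset,  # placeholder: document your data source
    peft_config=lora_config,      # LoraConfig from the sketch above
)
trainer.train()
```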
## Hardware & runtime
- GPU not detected from notebook logs.
## Framework versions
- PEFT: 0.17.0
- TRL: 0.21.0
- Transformers: 4.55.1
- PyTorch: 2.8.0
- Datasets: 3.6.0
- Tokenizers: 0.21.4
## Example prompts
- "Explain diffusion models to a 12-year-old."
- "Write a polite email asking for an extension on a project."
- "Summarize the following text in 3 bullet points: ..."
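You can run these prompts in a batch by reusing the `pipe` from the quick start, e.g.:

```python
# Reuses the `pipe` object from the quick start section.
prompts = [
    "Explain diffusion models to a 12-year-old.",
    "Write a polite email asking for an extension on a project.",
]
for p in prompts:
    result = pipe(p, max_new_tokens=256, do_sample=True, top_p=0.9, temperature=0.7)
    print(result[0]["generated_text"])
```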
## Evaluation
No formal evaluation metrics were logged in the notebook. If you run evaluations (e.g., on MT-Bench, MMLU, or a domain-specific set), please add the results here in a Model Index block or a table.
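If you just need a quick sanity check before running a proper benchmark, one hedged option is to log mean token loss and perplexity on a few held-out texts you supply (the `eval_texts` list below is a placeholder), reusing `model` and `tokenizer` from the quick start:

```python
import math
import torch

# Placeholder held-out examples; replace with real evaluation texts.
eval_texts = ["..."]

model.eval()
losses = []
for text in eval_texts:
    enc = tokenizer(text, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])
    losses.append(out.loss.item())

mean_loss = sum(losses) / len(losses)
print(f"mean loss: {mean_loss:.4f}, perplexity: {math.exp(mean_loss):.2f}")
```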
## Pushing to the Hub
```python
from huggingface_hub import create_repo, upload_folder

REPO_ID = "outputs_sft"  # e.g., "YourUsername/outputs_sft"

# 1) Create the repo (once)
# create_repo(REPO_ID, repo_type="model", private=False)

# 2) Upload your adapter or merged model folder
upload_folder(
    repo_id=REPO_ID,
    folder_path="./outputs_sft",  # change to your output dir
    commit_message="Add SFT model",
)
```
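Alternatively, the model and tokenizer objects can push themselves directly; a short sketch:

```python
# Equivalent sketch using the objects' own push_to_hub helpers; assumes the
# `model` and `tokenizer` from the quick start and an existing Hub repo.
model.push_to_hub(REPO_ID, commit_message="Add SFT model")
tokenizer.push_to_hub(REPO_ID)
```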
## License

Set `license` in the YAML header to a license compatible with the base model and your data (e.g., `apache-2.0`, `mit`, or the specific Qwen license if required).
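For reference, a hedged sketch of such a model-card YAML header; the license value is a placeholder to replace with one compatible with Qwen/Qwen3-4B and your data:

```yaml
---
license: apache-2.0   # placeholder: choose a compatible license
base_model: Qwen/Qwen3-4B
library_name: peft
tags:
  - trl
  - sft
  - lora
---
```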
## Citations
```bibtex
@misc{vonwerra2022trl,
  title        = {{TRL: Transformer Reinforcement Learning}},
  author       = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
  year         = 2020,
  journal      = {GitHub repository},
  publisher    = {GitHub},
  howpublished = {\url{https://github.com/huggingface/trl}}
}

@misc{peft,
  title        = {PEFT: State-of-the-art Parameter-Efficient Fine-Tuning methods},
  author       = {Sourab Mangrulkar and Sylvain Gugger and Lysandre Debut and Younes Belkada and Sayak Paul and Benjamin Bossan},
  year         = 2022,
  howpublished = {\url{https://github.com/huggingface/peft}}
}

@inproceedings{wolf-etal-2020-transformers,
  title     = {Transformers: State-of-the-Art Natural Language Processing},
  author    = {Thomas Wolf and others},
  booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations},
  year      = 2020
}
```