unlimitedbytes committed (verified)
Commit 1951d49 · 1 Parent(s): 0a2552d

Upload README.md with huggingface_hub

Files changed (1):
  README.md (+44, -45)
README.md CHANGED
@@ -1,62 +1,61 @@
  ---
  base_model: openai/gpt-oss-20b
- library_name: peft
- model_name: gptoss-bigcodebench-full
  tags:
- - base_model:adapter:openai/gpt-oss-20b
  - lora
- - sft
- - transformers
- - trl
- licence: license
- pipeline_tag: text-generation
  ---

- # Model Card for gptoss-bigcodebench-full

- This model is a fine-tuned version of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b).
- It has been trained using [TRL](https://github.com/huggingface/trl).

- ## Quick start

- ```python
- from transformers import pipeline

- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="None", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
  ```

- ## Training procedure
-
- This model was trained with SFT.
-
- ### Framework versions
-
- - PEFT 0.17.0
- - TRL: 0.20.0
- - Transformers: 4.55.0
- - Pytorch: 2.8.0.dev20250319+cu128
- - Datasets: 4.0.0
- - Tokenizers: 0.21.4
-
- ## Citations

- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
-     title = {{TRL: Transformer Reinforcement Learning}},
-     author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-     year = 2020,
-     journal = {GitHub repository},
-     publisher = {GitHub},
-     howpublished = {\url{https://github.com/huggingface/trl}}
- }
- ```
 
  ---
+ license: apache-2.0
  base_model: openai/gpt-oss-20b
+ library_name: transformers
+ pipeline_tag: text-generation
  tags:
+ - peft
  - lora
+ - bigcodebench
+ - gpt-oss
+ - code
+ - causal-lm
+ inference: false
  ---
+ # GPT-OSS-20B BigCodeBench LoRA Adapter

+ LoRA adapter weights fine-tuned from `openai/gpt-oss-20b` on BigCodeBench split `v0.1.4` (~1.1K samples).
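
+ As a pointer for the data, the split can be loaded with `datasets`. A minimal sketch, assuming the dataset id `bigcode/bigcodebench` and the split name; neither is stated on this card:

+ ```python
+ from datasets import load_dataset
+
+ # Assumed dataset id and split; adjust to the actual BigCodeBench release used.
+ ds = load_dataset("bigcode/bigcodebench", split="v0.1.4")
+ print(len(ds), list(ds[0].keys())[:5])
+ ```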
 
+ ## Training Summary

+ - Steps: 100
+ - Final training loss: 0.7833
+ - Runtime: 3717.3 s (~62 min)
+ - Throughput: 0.43 samples/s
+ - Total FLOPs: ~6.83e16
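
+ For context, a comparable run could be put together with TRL's `SFTTrainer` and a PEFT `LoraConfig`. This is a hypothetical sketch, not the card's actual recipe: the LoRA rank/alpha, dropout, field names, and dataset id/split are illustrative assumptions:

+ ```python
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from trl import SFTConfig, SFTTrainer
+
+ dataset = load_dataset("bigcode/bigcodebench", split="v0.1.4")  # assumed id/split
+
+ def to_text(example):
+     # Assumed BigCodeBench field names: join prompt and solution into one training text.
+     return {"text": example["complete_prompt"] + example["canonical_solution"]}
+
+ dataset = dataset.map(to_text)
+
+ trainer = SFTTrainer(
+     model="openai/gpt-oss-20b",
+     train_dataset=dataset,
+     args=SFTConfig(output_dir="gptoss-bigcodebench-20b-lora", max_steps=100),
+     peft_config=LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05),
+ )
+ trainer.train()
+ ```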
 
+ ## Usage

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ base = 'openai/gpt-oss-20b'
+ adapter = 'unlimitedbytes/gptoss-bigcodebench-20b-lora'
+
+ # Load the base model first, then attach the LoRA adapter on top of it.
+ model = AutoModelForCausalLM.from_pretrained(base, device_map='auto', torch_dtype='auto')
+ model = PeftModel.from_pretrained(model, adapter)
+ tokenizer = AutoTokenizer.from_pretrained(base)
+
+ messages = [
+     {'role': 'system', 'content': 'You are a helpful coding assistant.'},
+     {'role': 'user', 'content': 'Write a Python function to add two numbers.'}
+ ]
+ input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors='pt').to(model.device)
+ out = model.generate(input_ids, max_new_tokens=128)
+ print(tokenizer.decode(out[0], skip_special_tokens=False))
  ```
 
+ To merge the adapter into the base model and save standalone weights:

+ ```python
+ # merge_and_unload folds the LoRA deltas into the base weights,
+ # so the merged model no longer needs PEFT at load time.
+ model = model.merge_and_unload()
+ model.save_pretrained('merged-model')
+ ```
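
+ A natural follow-up is to save the tokenizer next to the merged weights so the directory reloads on its own. A minimal sketch; `merged-model` is the output path from the block above:

+ ```python
+ # Save the tokenizer alongside the merged weights.
+ tokenizer.save_pretrained('merged-model')
+
+ # Later, the merged model reloads directly, with no PEFT dependency:
+ from transformers import AutoModelForCausalLM
+ reloaded = AutoModelForCausalLM.from_pretrained('merged-model', device_map='auto', torch_dtype='auto')
+ ```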
 
+ ## Limitations

+ - Only 100 training steps; the adapter is not fully converged.
+ - Adapter weights only; merged full weights are not published.
+ - Outputs may include control tokens (see the decoding note below).
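
+ The control tokens can usually be stripped at decode time. A minimal sketch, reusing `tokenizer` and `out` from the Usage block:

+ ```python
+ # skip_special_tokens=True drops special/control tokens from the decoded text.
+ print(tokenizer.decode(out[0], skip_special_tokens=True))
+ ```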
 
+ ## License

+ Apache-2.0 (inherited from the `openai/gpt-oss-20b` base model), plus the licenses of the BigCodeBench dataset.