unlimitedbytes committed (verified)
Commit 1951d49 · 1 Parent(s): 0a2552d

Upload README.md with huggingface_hub

Files changed (1):
  README.md (+44, -45)
README.md CHANGED
@@ -1,62 +1,61 @@
  ---
  base_model: openai/gpt-oss-20b
- library_name: peft
- model_name: gptoss-bigcodebench-full
  tags:
- - base_model:adapter:openai/gpt-oss-20b
  - lora
- - sft
- - transformers
- - trl
- licence: license
- pipeline_tag: text-generation
  ---

- # Model Card for gptoss-bigcodebench-full

- This model is a fine-tuned version of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b).
- It has been trained using [TRL](https://github.com/huggingface/trl).

- ## Quick start

- ```python
- from transformers import pipeline

- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
- generator = pipeline("text-generation", model="None", device="cuda")
- output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
- print(output["generated_text"])
  ```

- ## Training procedure
-
- This model was trained with SFT.
-
- ### Framework versions
-
- - PEFT 0.17.0
- - TRL: 0.20.0
- - Transformers: 4.55.0
- - Pytorch: 2.8.0.dev20250319+cu128
- - Datasets: 4.0.0
- - Tokenizers: 0.21.4
-
- ## Citations

- Cite TRL as:
-
- ```bibtex
- @misc{vonwerra2022trl,
-     title = {{TRL: Transformer Reinforcement Learning}},
-     author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
-     year = 2020,
-     journal = {GitHub repository},
-     publisher = {GitHub},
-     howpublished = {\url{https://github.com/huggingface/trl}}
- }
- ```
 
  ---
+ license: apache-2.0
  base_model: openai/gpt-oss-20b
+ library_name: transformers
+ pipeline_tag: text-generation
  tags:
+ - peft
  - lora
+ - bigcodebench
+ - gpt-oss
+ - code
+ - causal-lm
+ inference: false
  ---
+ # GPT-OSS-20B BigCodeBench LoRA Adapter

+ LoRA adapter weights fine-tuned from `openai/gpt-oss-20b` on BigCodeBench split `v0.1.4` (~1.1K samples).
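
+ As a pointer for the data, the split can be loaded with `datasets`. A minimal sketch, assuming the dataset id `bigcode/bigcodebench` and the split name; neither is stated on this card:

+ ```python
+ from datasets import load_dataset
+
+ # Assumed dataset id and split; adjust to the actual BigCodeBench release used.
+ ds = load_dataset("bigcode/bigcodebench", split="v0.1.4")
+ print(len(ds), list(ds[0].keys())[:5])
+ ```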
 
+ ## Training Summary

+ - Steps: 100
+ - Final training loss: 0.7833
+ - Runtime: 3717.3 s (~62 min)
+ - Throughput: 0.43 samples/s
+ - Total FLOPs: ~6.83e16
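
+ For context, a comparable run could be put together with TRL's `SFTTrainer` and a PEFT `LoraConfig`. This is a hypothetical sketch, not the card's actual recipe: the LoRA rank/alpha, dropout, field names, and dataset id/split are illustrative assumptions:

+ ```python
+ from datasets import load_dataset
+ from peft import LoraConfig
+ from trl import SFTConfig, SFTTrainer
+
+ dataset = load_dataset("bigcode/bigcodebench", split="v0.1.4")  # assumed id/split
+
+ def to_text(example):
+     # Assumed BigCodeBench field names: join prompt and solution into one training text.
+     return {"text": example["complete_prompt"] + example["canonical_solution"]}
+
+ dataset = dataset.map(to_text)
+
+ trainer = SFTTrainer(
+     model="openai/gpt-oss-20b",
+     train_dataset=dataset,
+     args=SFTConfig(output_dir="gptoss-bigcodebench-20b-lora", max_steps=100),
+     peft_config=LoraConfig(task_type="CAUSAL_LM", r=16, lora_alpha=32, lora_dropout=0.05),
+ )
+ trainer.train()
+ ```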
 
+ ## Usage

+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+
+ base = 'openai/gpt-oss-20b'
+ adapter = 'unlimitedbytes/gptoss-bigcodebench-20b-lora'
+
+ # Load the base model first, then attach the LoRA adapter on top of it.
+ model = AutoModelForCausalLM.from_pretrained(base, device_map='auto', torch_dtype='auto')
+ model = PeftModel.from_pretrained(model, adapter)
+ tokenizer = AutoTokenizer.from_pretrained(base)
+
+ messages = [
+     {'role': 'system', 'content': 'You are a helpful coding assistant.'},
+     {'role': 'user', 'content': 'Write a Python function to add two numbers.'}
+ ]
+ input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors='pt').to(model.device)
+ out = model.generate(input_ids, max_new_tokens=128)
+ print(tokenizer.decode(out[0], skip_special_tokens=False))
  ```
 
+ To merge the adapter into the base model and save standalone weights:

+ ```python
+ # merge_and_unload folds the LoRA deltas into the base weights,
+ # so the merged model no longer needs PEFT at load time.
+ model = model.merge_and_unload()
+ model.save_pretrained('merged-model')
+ ```
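
+ A natural follow-up is to save the tokenizer next to the merged weights so the directory reloads on its own. A minimal sketch; `merged-model` is the output path from the block above:

+ ```python
+ # Save the tokenizer alongside the merged weights.
+ tokenizer.save_pretrained('merged-model')
+
+ # Later, the merged model reloads directly, with no PEFT dependency:
+ from transformers import AutoModelForCausalLM
+ reloaded = AutoModelForCausalLM.from_pretrained('merged-model', device_map='auto', torch_dtype='auto')
+ ```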
 
+ ## Limitations

+ - Only 100 training steps; the adapter is not fully converged.
+ - Adapter weights only; merged full weights are not published.
+ - Outputs may include control tokens (see the decoding note below).
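
+ The control tokens can usually be stripped at decode time. A minimal sketch, reusing `tokenizer` and `out` from the Usage block:

+ ```python
+ # skip_special_tokens=True drops special/control tokens from the decoded text.
+ print(tokenizer.decode(out[0], skip_special_tokens=True))
+ ```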
 
+ ## License

+ Apache-2.0 (inherited from the `openai/gpt-oss-20b` base model), plus the licenses of the BigCodeBench dataset.