Day1Kim committed
Commit 4014504 · verified · 1 parent: 759a373

Update README.md

Files changed (1)
  1. README.md +55 -3
README.md CHANGED
@@ -7,28 +7,80 @@ tags:
- generated_from_trainer
- sft
- trl
+ - korean
+ - 한국어
licence: license
+ license: apache-2.0
+ language:
+ - ko
+ pipeline_tag: text-generation
---

# Model Card for gpt-oss-20b-korean-reasoner

This model is a fine-tuned version of [openai/gpt-oss-20b](https://huggingface.co/openai/gpt-oss-20b) on the [Day1Kim/Multilingual-Thinking-KO](https://huggingface.co/datasets/Day1Kim/Multilingual-Thinking-KO) dataset.
- It has been trained using [TRL](https://github.com/huggingface/trl).
+ It has been trained using [TRL](https://github.com/huggingface/trl).
+
+ A model fine-tuned on a Korean-language thinking (reasoning) dataset.

## Quick start

```python
from transformers import pipeline

- question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ question = "한국의 수도는?"  # "What is the capital of Korea?"
generator = pipeline("text-generation", model="Day1Kim/gpt-oss-20b-korean-reasoner", device="cuda")
output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
print(output["generated_text"])
```

+ ### Loading the model
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ from peft import PeftModel
+ import torch
+
+ # Load the tokenizer of the base model
+ tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")
+
+ # Load the base model; device_map="auto" already places it on the GPU,
+ # so no explicit .cuda() call is needed
+ model_kwargs = dict(attn_implementation="eager", torch_dtype="auto", use_cache=True, device_map="auto")
+ base_model = AutoModelForCausalLM.from_pretrained("openai/gpt-oss-20b", **model_kwargs)
+
+ # Merge the fine-tuned adapter weights into the base model
+ peft_model_id = "gpt-oss-20b-korean-reasoner"  # local adapter path or Hub repo ID
+ model = PeftModel.from_pretrained(base_model, peft_model_id)
+ model = model.merge_and_unload()
+
+ REASONING_LANGUAGE = "Korean"
+ SYSTEM_PROMPT = f"reasoning language: {REASONING_LANGUAGE}"
+ USER_PROMPT = "한국의 수도는?"
+
+ messages = [
+     {"role": "system", "content": SYSTEM_PROMPT},
+     {"role": "user", "content": USER_PROMPT},
+ ]
+
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt",
+ ).to(model.device)
+
+ gen_kwargs = {"max_new_tokens": 512, "do_sample": True, "temperature": 0.6, "top_p": None, "top_k": None}
+
+ output_ids = model.generate(input_ids, **gen_kwargs)
+ response = tokenizer.batch_decode(output_ids)[0]
+ print(response)
+ ```
+
## Training procedure

-
+ - **Base model**: openai/gpt-oss-20b
+ - **Training steps**: 65
+ - **Epochs**: 5
+ - **Dataset**: Day1Kim/Multilingual-Thinking-KO


This model was trained with SFT.
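
For reference, a minimal sketch of what an SFT run matching the settings above could look like with TRL's `SFTTrainer`. Only the base model, dataset, and epoch count come from this card; the LoRA configuration, batch size, and learning rate are illustrative assumptions, and the PEFT setup is inferred from the adapter-loading example.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Dataset named in this card
dataset = load_dataset("Day1Kim/Multilingual-Thinking-KO", split="train")

# LoRA and training hyperparameters are illustrative assumptions,
# except num_train_epochs, which matches the "Epochs: 5" stated above
peft_config = LoraConfig(r=8, lora_alpha=16, target_modules="all-linear", task_type="CAUSAL_LM")
training_args = SFTConfig(
    output_dir="gpt-oss-20b-korean-reasoner",
    num_train_epochs=5,
    per_device_train_batch_size=1,    # assumption
    gradient_accumulation_steps=8,    # assumption
    learning_rate=2e-4,               # assumption
    logging_steps=1,
)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",       # base model from this card
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
trainer.save_model()
```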
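
Finally, the `reasoning language` system prompt from the loading example can also be supplied through the high-level `pipeline` API used in the Quick start. A small sketch; the prompt string follows the card's own convention, and the generation settings here are arbitrary.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="Day1Kim/gpt-oss-20b-korean-reasoner", device="cuda")

# Same system-prompt convention as in the PEFT loading example above
messages = [
    {"role": "system", "content": "reasoning language: Korean"},
    {"role": "user", "content": "한국의 수도는?"},  # "What is the capital of Korea?"
]

output = generator(messages, max_new_tokens=256, return_full_text=False)[0]
print(output["generated_text"])
```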