ChickenMcSwag committed on
Commit 03d2169 · verified · 1 Parent(s): 4c3eeb6

Add model card

Files changed (1)
  1. README.md +71 -0
README.md ADDED
@@ -0,0 +1,71 @@
+ ---
+ license: other
+ base_model: openai/gpt-oss-20b
+ tags:
+ - gpt-oss-20b
+ - lora
+ - merged
+ - causal-lm
+ language:
+ - en
+ ---
+
+ # gpt-oss-20b-lora-finetuned_fp4_step_40
+
+ This is a merged model combining GPT-OSS-20B with a fine-tuned LoRA adapter.
+
+ ## Model Details
+
+ - **Base Model**: openai/gpt-oss-20b
+ - **LoRA Checkpoint**: checkpoint-40
+ - **Model Type**: Causal Language Model
+ - **Model Size**: ~20B parameters
+ - **Tensor Type**: bfloat16
+
+ ## LoRA Configuration
+
+ - **Rank (r)**: 8
+ - **Alpha**: 16
+ - **Target Modules**: k_proj, v_proj, o_proj, q_proj
+ - **Special MLP Expert Layers**: Layers 7, 15, 23
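
The rank and alpha listed above imply the conventional LoRA scaling factor alpha / r = 16 / 8 = 2. The sketch below shows how a merged weight is usually formed, W' = W + (alpha / r) * (B @ A), on tiny made-up matrices in pure Python. The matrix values and the helper names `matmul` and `merge_lora` are illustrative assumptions; the commit does not include the actual merge script, so this is not necessarily how this model was merged.

```python
# Illustrative only: standard LoRA merge W' = W + (alpha / r) * (B @ A).
# Shapes and values are toy assumptions; r and alpha come from the model card.

R = 8        # LoRA rank from the card
ALPHA = 16   # LoRA alpha from the card
SCALING = ALPHA / R  # conventional LoRA scaling factor


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def merge_lora(w, b, a, scaling):
    """Return w + scaling * (b @ a) without modifying the inputs."""
    delta = matmul(b, a)
    return [
        [w[i][j] + scaling * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# Toy 2x2 base weight with rank-8 factors: B is 2x8, A is 8x2.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[0.5] * R, [0.0] * R]
a = [[0.25, 0.0] for _ in range(R)]

merged = merge_lora(w, b, a, SCALING)
print(SCALING)
print(merged)
```

Once merged this way, the adapter matrices are no longer needed at inference time, which is why a merged checkpoint loads like an ordinary causal language model.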
+
+ ## Quick Start
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Load model and tokenizer
+ model = AutoModelForCausalLM.from_pretrained(
+     "ChickenMcSwag/gpt-oss-20b-lora-finetuned_fp4_step_40",
+     torch_dtype="auto",
+     device_map="auto",
+     trust_remote_code=True
+ )
+ tokenizer = AutoTokenizer.from_pretrained("ChickenMcSwag/gpt-oss-20b-lora-finetuned_fp4_step_40")
+
+ # Generate text
+ prompt = "The future of AI is"
+ # Move the inputs to the model's device, since device_map="auto" may place it on a GPU
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+ outputs = model.generate(
+     **inputs,
+     max_length=100,  # total length: prompt tokens plus generated tokens
+     temperature=0.7,
+     do_sample=True,
+     top_p=0.95
+ )
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ## Hardware Requirements
+
+ - **Minimum VRAM**: ~40GB for inference
+ - **Recommended**: 2x A100 80GB or equivalent
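
The ~40GB figure is consistent with holding roughly 20B parameters in bfloat16, at 2 bytes per parameter. A rough sketch of that arithmetic follows. It counts weights only, uses the approximate parameter count from the card, and the function names are illustrative; the KV cache and activations need additional headroom on top of this.

```python
# Back-of-the-envelope weight memory: parameters * bytes per parameter.
# Counts weights only; KV cache and activations need extra headroom.

PARAMS = 20e9           # "~20B parameters" from the card (approximate)
BYTES_PER_PARAM = 2     # bfloat16 stores each value in 2 bytes


def weight_memory_gb(params, bytes_per_param):
    """Weight footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


def weight_memory_gib(params, bytes_per_param):
    """Weight footprint in binary gibibytes (1 GiB = 2**30 bytes)."""
    return params * bytes_per_param / 2**30


gb = weight_memory_gb(PARAMS, BYTES_PER_PARAM)
gib = weight_memory_gib(PARAMS, BYTES_PER_PARAM)
print(f"{gb:.1f} GB ({gib:.1f} GiB)")
```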
+
+ ## License
+
+ This model follows the original GPT-OSS-20B license. Please refer to the base model's license and usage policy.
+
+ ## Citation
+
+ If you use this model, please cite the original GPT-OSS-20B model.