# Using a Trained Mini-GPT Model (Safetensors)

This guide explains how to **load a trained Mini-GPT model** saved in `safetensors` format and generate text with it.
It is written step by step for **learning and understanding**.

---

## 1️⃣ Install Required Packages

Make sure you have the necessary packages:

```bash
pip install torch transformers safetensors accelerate
```
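
As a quick optional sanity check, you can confirm that everything imports and print the installed versions:

```python
# Optional sanity check: confirm the packages import and print their versions
import torch
import transformers
import safetensors

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("safetensors:", safetensors.__version__)
print("CUDA available:", torch.cuda.is_available())
```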

## 2️⃣ Load the Trained Model and Tokenizer

The model was saved earlier in `./mini_gpt_safetensor`.
Here's how to load it:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./mini_gpt_safetensor"  # path to your saved model

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style models have no pad token by default

# Load model
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")  # "auto" picks a GPU if available
```

> **Note:** `device_map="auto"` loads the model onto a GPU if one is available, otherwise onto the CPU. It relies on the `accelerate` package (included in the install command above).
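
If you want to confirm where the weights actually ended up, a one-line check:

```python
# Optional: see which device the model weights were placed on
print(next(model.parameters()).device)  # e.g. cuda:0 or cpu
```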

## 3️⃣ Generate Text from a Prompt

Once the model is loaded, we can generate text with a simple function:

```python
def generate_text(prompt, max_length=50):
    # Tokenize the prompt (returns input_ids and attention_mask)
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate text
    output_ids = model.generate(
        **inputs,
        max_length=max_length,
        do_sample=True,    # enable randomness
        top_k=50,          # sample from the 50 most likely tokens
        top_p=0.95,        # nucleus sampling
        temperature=0.7,   # creativity factor
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,  # avoids a warning, since GPT has no pad token
    )

    # Decode output into readable text
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return output_text
```

**Tips:**

- `do_sample=True` → samples randomly instead of always picking the most likely token
- `top_k` and `top_p` → limit sampling to the most probable tokens
- `temperature` → higher values = more creative (and less predictable) output

A quick way to see the effect of these parameters is sketched below.
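
Here is a minimal sketch that re-samples the same prompt at a few temperatures; the prompt and parameter values are illustrative, and it assumes the model and tokenizer from step 2 are loaded:

```python
# Sample the same prompt at several temperatures (illustrative values)
inputs = tokenizer("AI projects are", return_tensors="pt").to(model.device)

for temp in [0.3, 0.7, 1.2]:
    output_ids = model.generate(
        **inputs,
        max_length=40,
        do_sample=True,
        temperature=temp,
        top_k=50,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(f"temperature={temp}: {tokenizer.decode(output_ids[0], skip_special_tokens=True)}")
```

Lower temperatures stay close to the most likely continuations; higher ones produce more varied, and more error-prone, text.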

## 4️⃣ Test Text Generation

Use the function with any prompt:

```python
prompt = "Hello, I am training a mini GPT model"
generated_text = generate_text(prompt, max_length=50)

print("\n📝 Generated text:")
print(generated_text)
```

Example output:

```text
Hello, I am training a mini GPT model to generate simple sentences about Python, deep learning, and AI projects.
```

## ✅ Summary

- Load the tokenizer and model from the safetensors folder.
- Call `generate` with suitable sampling parameters for creative text.
- Decode the output to get readable text.
- Experiment with `prompt`, `max_length`, `top_k`, `top_p`, and `temperature` to control the generated text (one more experiment is sketched below).
- By following this guide, you can load any trained Mini-GPT model and generate text interactively.
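
As one last experiment, a small sketch (prompt and values are illustrative) that uses `num_return_sequences`, already seen in the `generate_text` function, to draw several candidate completions at once:

```python
# Draw several independent completions for one prompt
inputs = tokenizer("Deep learning is", return_tensors="pt").to(model.device)

output_ids = model.generate(
    **inputs,
    max_length=40,
    do_sample=True,
    top_p=0.9,
    num_return_sequences=3,  # three independent samples
    pad_token_id=tokenizer.eos_token_id,
)

for i, seq in enumerate(output_ids, start=1):
    print(f"Candidate {i}: {tokenizer.decode(seq, skip_special_tokens=True)}")
```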