---
license: mit
datasets:
- fka/awesome-chatgpt-prompts
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen-Image
new_version: openai/gpt-oss-120b
pipeline_tag: text-generation
library_name: transformers
tags:
- code
---
# Using a Trained Mini-GPT Model (Safetensors)
This guide explains how to **load a trained Mini-GPT model** saved in `safetensors` format and generate text using it.
It walks through every step so the process is easy to **learn and understand**.
---
## 1️⃣ Install Required Packages
Make sure you have the necessary packages:
```bash
pip install torch transformers safetensors accelerate
```
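Optionally, verify the installation and check whether a GPU is visible (a quick sanity check; not required for the rest of the guide):

```python
import torch
import transformers

print("transformers:", transformers.__version__)
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
```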
## 2️⃣ Load the Trained Model and Tokenizer
We saved our model earlier in `./mini_gpt_safetensor`. Here's how to load it:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./mini_gpt_safetensor"  # Path to your saved model

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style tokenizers have no pad token, so reuse EOS

# Load model ("auto" places it on the GPU if available, otherwise the CPU)
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
model.eval()  # inference mode: disables dropout
```
Note: Using `device_map="auto"` loads the model onto the GPU if one is available, otherwise the CPU. This option requires the `accelerate` package installed above.
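To see where the weights actually landed, you can inspect the loaded model (a quick sanity check; `hf_device_map` is only set when a device map was used):

```python
print(model.device)  # device of the model's parameters, e.g. cuda:0 or cpu

# With device_map, transformers records the full layer placement here
print(getattr(model, "hf_device_map", "no device map recorded"))
```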
## 3️⃣ Generate Text from a Prompt
Once the model is loaded, we can generate text using a simple function:
```python
def generate_text(prompt, max_length=50):
    # Tokenize prompt and move it to the model's device
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

    # Generate text (no gradients needed at inference time)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,              # passes input_ids and attention_mask
            max_length=max_length,
            do_sample=True,        # enable randomness
            top_k=50,              # sample only from the 50 most likely tokens
            top_p=0.95,            # nucleus sampling
            temperature=0.7,       # creativity factor
            num_return_sequences=1,
            pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
        )

    # Decode output IDs back into a string
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```
**Tips** (see the sketch below for these knobs in action):
- `do_sample=True` → sample tokens randomly instead of greedy decoding
- `top_k` / `top_p` → restrict sampling to the most likely tokens
- `temperature` → higher values = more creative (and less predictable) output
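As a minimal sketch of these knobs in action (reusing `tokenizer` and `model` from step 2; the prompt is just a placeholder, and sampled outputs vary from run to run):

```python
prompt = "Deep learning is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding: deterministic, always picks the most likely next token
greedy_ids = model.generate(
    **inputs, max_length=30, do_sample=False,
    pad_token_id=tokenizer.eos_token_id,
)
print("greedy:", tokenizer.decode(greedy_ids[0], skip_special_tokens=True))

# Sampling: higher temperature spreads probability over more tokens
for temp in (0.7, 1.2):
    sampled_ids = model.generate(
        **inputs, max_length=30, do_sample=True,
        top_k=50, top_p=0.95, temperature=temp,
        pad_token_id=tokenizer.eos_token_id,
    )
    print(f"T={temp}:", tokenizer.decode(sampled_ids[0], skip_special_tokens=True))
```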
## 4️⃣ Test Text Generation
Use your function with any prompt:
```python
prompt = "Hello, I am training a mini GPT model"
generated_text = generate_text(prompt, max_length=50)
print("\n📝 Generated text:")
print(generated_text)
```
Example output (sampling is random, so your output will differ):
```text
Hello, I am training a mini GPT model to generate simple sentences about Python, deep learning, and AI projects.
```
## ✅ Summary
- Load the tokenizer and model from the safetensors folder.
- Call `generate` with sensible sampling parameters for creative text.
- Decode the output IDs back into readable text.
- Experiment with `prompt`, `max_length`, `top_k`, `top_p`, and `temperature` to control text generation; a small sketch follows below.
- By following this guide, you can load any trained Mini-GPT model and generate text interactively.
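For example, a small loop over prompts and lengths (a sketch reusing the `generate_text` helper from step 3; the prompts are placeholders) makes different settings easy to compare:

```python
prompts = ["Python is", "Deep learning helps"]  # placeholder prompts

for p in prompts:
    for length in (30, 60):
        print(f"--- prompt={p!r}, max_length={length} ---")
        print(generate_text(p, max_length=length))
```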