ankitkushwaha90/distilgpt2-finetune-huggingface

Token Classification

ankitkushwaha90 committed 12 days ago · Commit 994bc32 (verified) · 1 Parent(s): a9ff5bb

Create README.md

Files changed (1)

README.md +90 -0

	@@ -0,0 +1,90 @@

+# Using a Trained Mini-GPT Model (Safetensors)
+This guide explains how to **load a trained Mini-GPT model** saved in `safetensors` format and generate text using it.
+It is written in a step-by-step manner for **learning and understanding**.
+
+---
+## 1️⃣ Install Required Packages
+Install the packages used in this guide (`accelerate` is required by the `device_map="auto"` option used when loading the model):
+```bash
+pip install torch transformers safetensors accelerate
+```
+## 2️⃣ Load the Trained Model and Tokenizer
+The model was saved earlier to `./mini_gpt_safetensor`.
+Load it together with its tokenizer:
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+model_path = "./mini_gpt_safetensor"  # Path to your saved model
+# Load tokenizer
+tokenizer = AutoTokenizer.from_pretrained(model_path)
+tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style tokenizers define no pad token, so reuse EOS
+# Load model
+model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")  # uses a GPU if available (requires accelerate)
+```
+Note: `device_map="auto"` places the model on a GPU if one is available and on the CPU otherwise. This option requires the `accelerate` package.
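Before loading, it can help to confirm that the saved folder actually contains what `from_pretrained` will look for. The sketch below is a minimal pre-flight check, not part of the original guide. It assumes the folder was written by `save_pretrained`, which typically produces a `config.json` and one or more `*.safetensors` weight files. The helper name `missing_model_files` is hypothetical, and tokenizer files are not checked because their names vary by tokenizer.

```python
from pathlib import Path


def missing_model_files(model_dir):
    """Return a list of expected-but-missing items in a saved model folder.

    Assumption: the folder was created by save_pretrained, so it should hold
    a config.json and at least one *.safetensors weight file.
    """
    folder = Path(model_dir)
    if not folder.is_dir():
        return [f"{folder} (directory not found)"]
    missing = []
    if not (folder / "config.json").is_file():
        missing.append("config.json")
    if not any(folder.glob("*.safetensors")):
        missing.append("*.safetensors weights")
    return missing


# Path taken from the guide; adjust it to your own folder.
problems = missing_model_files("./mini_gpt_safetensor")
if problems:
    print("Missing:", ", ".join(problems))
else:
    print("Model folder looks complete.")
```

An empty list means the basic files are in place; anything else names what to look for before retrying the load.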
+## 3️⃣ Generate Text from a Prompt
+Once the model is loaded, we can generate text using a simple function:
+```python
+def generate_text(prompt, max_length=50):
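+    # Tokenize the prompt and move the tensors to the model's device
+    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
+    # Generate text
+    output_ids = model.generate(
+        **inputs,                             # input_ids and attention_mask
+        max_length=max_length,                # total length in tokens, prompt included
+        do_sample=True,                       # sample instead of greedy decoding
+        top_k=50,                             # sample from the 50 most likely tokens
+        top_p=0.95,                           # nucleus sampling over 95% of the probability mass
+        temperature=0.7,                      # below 1.0 = more focused, above 1.0 = more random
+        num_return_sequences=1,
+        pad_token_id=tokenizer.eos_token_id,  # avoids the missing-pad-token warning
+    )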
+    # Decode output
+    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
+    return output_text
+```
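+Tip:
+- `do_sample=True` → samples the next token instead of always picking the most likely one, so outputs vary between runs
+- `top_k` and `top_p` → restrict sampling to the most likely tokens (here the 50 highest-ranked, then the smallest set covering 95% of the probability mass)
+- `temperature` → values above 1.0 give more random output, values below 1.0 give more focused output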
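To make the three sampling parameters concrete, here is a simplified, pure-Python sketch of how temperature, top-k, and top-p can shape one sampling step. It is an illustration under assumptions, not the library's implementation: the function names and the four-token `toy_logits` are made up, and `transformers` applies these filters on tensors with details that differ.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def filter_top_k_top_p(probs, top_k=50, top_p=0.95):
    """Keep the top_k likeliest tokens, then the smallest prefix reaching top_p."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)[:top_k]
    top_k_mass = sum(p for _, p in ranked)
    kept, cumulative = [], 0.0
    for token_id, p in ranked:
        share = p / top_k_mass  # renormalize among the top_k survivors
        kept.append((token_id, share))
        cumulative += share
        if cumulative >= top_p:  # the token that crosses the threshold is kept
            break
    kept_mass = sum(share for _, share in kept)
    return {token_id: share / kept_mass for token_id, share in kept}


def sample_token(logits, temperature=0.7, top_k=50, top_p=0.95, rng=random):
    """Draw one token id from the filtered, renormalized distribution."""
    probs = softmax_with_temperature(logits, temperature)
    candidates = filter_top_k_top_p(probs, top_k, top_p)
    ids = list(candidates)
    return rng.choices(ids, weights=[candidates[i] for i in ids], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for a 4-token vocabulary
for t in (0.5, 1.0, 1.5):
    rounded = [round(p, 3) for p in softmax_with_temperature(toy_logits, t)]
    print(f"temperature={t}: {rounded}")
print("sampled token id:", sample_token(toy_logits))
```

Running it shows the probability of the top token shrinking as the temperature rises, which is why higher temperatures produce more varied text.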
+## 4️⃣ Test Text Generation
+Use your function with any prompt:
+```python
+prompt = "Hello, I am training a mini GPT model"
+generated_text = generate_text(prompt, max_length=50)
+print("\n📝 Generated text:")
+print(generated_text)
+```
+Example output (yours will differ, because sampling is random):
+```text
+Hello, I am training a mini GPT model to generate simple sentences about Python, deep learning, and AI projects.
+```
+## ✅ Summary
+- Load the tokenizer and model from the safetensors folder.
+- Call `model.generate` with sampling parameters to produce varied text.
+- Decode the output to get readable text.
+- Experiment with the prompt, `max_length`, `top_k`, `top_p`, and `temperature` to control text generation.
+- The same steps apply to any trained Mini-GPT model saved in this format.
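One way to experiment systematically is to run the same prompt through every combination of a few settings and compare the results side by side. The sketch below is a hedged illustration: `sweep` and `echo_generate` are hypothetical names, and `echo_generate` is only a stand-in so the example runs without a model. To use it with the guide's `generate_text`, that function would first need to accept the swept settings (it currently takes only `max_length`).

```python
from itertools import product


def sweep(generate_fn, prompt, grid):
    """Call generate_fn once per combination of the settings in grid."""
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        settings = dict(zip(names, values))
        results.append((settings, generate_fn(prompt, **settings)))
    return results


# Stand-in generator (assumption): replace with your own generation function.
def echo_generate(prompt, max_length=50, temperature=0.7):
    return f"{prompt} [max_length={max_length}, temperature={temperature}]"


grid = {"max_length": [30, 60], "temperature": [0.5, 1.0]}
for settings, text in sweep(echo_generate, "Hello", grid):
    print(settings, "->", text)
```

Keeping the grid small matters in practice, since each combination triggers a full generation call.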