---
license: mit
datasets:
- fka/awesome-chatgpt-prompts
language:
- en
metrics:
- accuracy
base_model:
- Qwen/Qwen-Image
new_version: openai/gpt-oss-120b
pipeline_tag: text-generation
library_name: transformers
tags:
- code
---
# Using a Trained Mini-GPT Model (Safetensors)

This guide explains how to **load a trained Mini-GPT model** saved in `safetensors` format and generate text with it.
It is written step by step for **learning and understanding**.

---

## 1️⃣ Install Required Packages

Make sure you have the necessary packages:

```bash
pip install torch transformers safetensors
```
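If you want to verify the installation and check whether a GPU is visible, a quick optional check (not required for the rest of the guide) looks like this:

```python
# Optional sanity check: confirm the packages import and whether a GPU is visible.
import torch
import transformers
import safetensors  # imported only to confirm the package is installed

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)
print("CUDA available:", torch.cuda.is_available())
```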
## 2️⃣ Load the Trained Model and Tokenizer

We saved our model earlier in `./mini_gpt_safetensor`.
Here’s how to load it:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_path = "./mini_gpt_safetensor"  # path to your saved model

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path)
tokenizer.pad_token = tokenizer.eos_token  # GPT-style models define no pad token by default

# Load model
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")  # uses the GPU if available
```

Note: Using `device_map="auto"` loads the model on the GPU if one is available, otherwise on the CPU.
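To confirm where the weights actually landed, a quick optional check (just a sketch, not part of the saved model) is:

```python
# Optional: see which device the model is on and how large it is.
print("Model device:", model.device)  # e.g. cuda:0 or cpu
print(f"Parameters: {sum(p.numel() for p in model.parameters()):,}")

# Explicit alternative to device_map="auto" (assumes the model fits on one device):
# device = "cuda" if torch.cuda.is_available() else "cpu"
# model = AutoModelForCausalLM.from_pretrained(model_path).to(device)
```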
## 3️⃣ Generate Text from a Prompt

Once the model is loaded, we can generate text using a simple function:

```python
def generate_text(prompt, max_length=50):
    # Tokenize the prompt and move it to the model's device
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids
    input_ids = input_ids.to(model.device)

    # Generate text
    output_ids = model.generate(
        input_ids,
        max_length=max_length,
        do_sample=True,                       # enable randomness
        top_k=50,                             # sample from the 50 most likely tokens
        top_p=0.95,                           # nucleus sampling
        temperature=0.7,                      # creativity factor
        num_return_sequences=1,
        pad_token_id=tokenizer.eos_token_id,  # silence the missing-pad-token warning
    )

    # Decode the output into readable text
    output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    return output_text
```
Tip:

- `do_sample=True` → sampled (random) outputs for more creative text
- `top_k` and `top_p` → limit which tokens can be sampled
- `temperature` → higher values = more creative, less predictable output

The short sketch below shows how these settings change the output in practice.
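It reuses the `model` and `tokenizer` loaded above; exact outputs will vary from run to run:

```python
prompt = "Deep learning is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Greedy decoding: deterministic, always picks the most likely next token.
greedy_ids = model.generate(**inputs, max_length=40, do_sample=False,
                            pad_token_id=tokenizer.eos_token_id)
print("greedy:", tokenizer.decode(greedy_ids[0], skip_special_tokens=True))

# Sampling with increasing temperature: outputs become more varied.
for temp in (0.5, 0.9, 1.3):
    sampled_ids = model.generate(**inputs, max_length=40, do_sample=True,
                                 top_k=50, top_p=0.95, temperature=temp,
                                 pad_token_id=tokenizer.eos_token_id)
    print(f"temperature={temp}:", tokenizer.decode(sampled_ids[0], skip_special_tokens=True))
```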
## 4️⃣ Test Text Generation

Use your function with any prompt:

```python
prompt = "Hello, I am training a mini GPT model"
generated_text = generate_text(prompt, max_length=50)

print("\n📝 Generated text:")
print(generated_text)
```
Example output:

```text
Hello, I am training a mini GPT model to generate simple sentences about Python, deep learning, and AI projects.
```
## ✅ Summary

- Load the tokenizer and model from the safetensors folder.
- Use `generate` with suitable sampling parameters for creative text.
- Decode the output to get readable text.
- Experiment with `prompt`, `max_length`, `top_k`, `top_p`, and `temperature` to control text generation (see the sketch below).

By following this guide, you can load any trained Mini-GPT model and generate text interactively.
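If you want to make that experimentation easier, one optional variant (a sketch, not part of the original code) exposes the sampling settings as arguments:

```python
# Hypothetical helper, not part of the original guide.
def generate_text_flexible(prompt, max_length=50, top_k=50, top_p=0.95, temperature=0.7):
    """Like generate_text, but with the sampling knobs exposed as arguments."""
    input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(model.device)
    output_ids = model.generate(
        input_ids,
        max_length=max_length,
        do_sample=True,
        top_k=top_k,
        top_p=top_p,
        temperature=temperature,
        pad_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

print(generate_text_flexible("Hello, I am training a mini GPT model", temperature=1.0))
```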