---
license: mit
language:
- en
tags:
- mental-health
- text-generation
- conversational-ai
- gpt2
pipeline_tag: text-generation
base_model: openai-community/gpt2
---
# GPT-2 (Fine-Tuned for MindPadi)
This is a fine-tuned version of GPT-2 for the **MindPadi** mental health chatbot. It has been adapted to generate empathetic, therapeutic, and contextually relevant responses for mental health support. It is the primary generative model used in long-form conversation and therapy-related dialogue management in MindPadi.
## 🧠 Model Summary
- **Model Type:** GPT-2 (12-layer transformer)
- **Parameters:** ~124M
- **Fine-Tuned For:** Empathetic and supportive text generation
- **Used In:** `app/chatbot/fusion_bot.py`
- **Architecture:** Decoder-only transformer (causal LM)
- **Framework:** Hugging Face Transformers + PyTorch
## 🧾 Intended Use
### ✔️ Primary Use Cases
- Generating thoughtful, compassionate responses in mental health conversations
- Completing sentences in a therapy dialogue setting
- Supporting GPTFusion workflows in MindPadi backend
### 🚫 Not Recommended For
- Clinical diagnoses or medical decisions
- Domains outside mental health (e.g., finance, legal)
- Multilingual generation (trained on English only)
## 🏋️‍♀️ Training Details
- **Base Model:** `gpt2` from Hugging Face
- **Fine-Tuning Script:** `training/finetune_gpt2_pipeline.py`
- **Datasets:** Mental health dialogue datasets (e.g., therapy transcripts, Reddit mental health subreddits)
- Location: `training/datasets/finetuned/`
- **Preprocessing:**
- Cleaned for profanity, PII, and formatting noise
- Converted to conversation format: `User: ... Assistant: ...`
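The exact cleaning and formatting logic lives in `training/finetune_gpt2_pipeline.py`; the snippet below is only a rough, hypothetical sketch of the conversation-format conversion (the field names are assumptions, not the real dataset schema):

```python
# Hypothetical sketch of the conversation-format step; the real implementation
# in training/finetune_gpt2_pipeline.py may differ.
def to_conversation_format(example: dict) -> str:
    # "user_text" and "assistant_text" are placeholder field names.
    return f"User: {example['user_text']}\nAssistant: {example['assistant_text']}"

print(to_conversation_format({
    "user_text": "I feel overwhelmed lately.",
    "assistant_text": "That sounds heavy. What has been weighing on you the most?",
}))
```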
### Hyperparameters
- Epochs: 3–5
- Batch Size: 4–8
- Learning Rate: 5e-5
- Warmup Steps: 200
- Optimizer: AdamW
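These values map onto a standard Hugging Face `Trainer` setup roughly as follows. This is a minimal sketch, not the actual `training/finetune_gpt2_pipeline.py`; the dataset and output paths are placeholders:

```python
from datasets import Dataset
from transformers import (
    DataCollatorForLanguageModeling,
    GPT2LMHeadModel,
    GPT2TokenizerFast,
    Trainer,
    TrainingArguments,
)

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = GPT2LMHeadModel.from_pretrained("gpt2")

# Placeholder data; the real corpus lives in training/datasets/finetuned/
texts = ["User: I feel hopeless and tired.\nAssistant: I'm sorry you're going through this. Do you want to talk about it?"]
dataset = Dataset.from_dict({"text": texts}).map(
    lambda x: tokenizer(x["text"], truncation=True, max_length=1024),
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="checkpoints/gpt2-mindpadi",  # placeholder path
    num_train_epochs=4,                      # 3-5 in practice
    per_device_train_batch_size=4,           # 4-8 in practice
    learning_rate=5e-5,
    warmup_steps=200,
    save_strategy="epoch",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    # mlm=False selects the causal-LM objective used by GPT-2 (optimizer defaults to AdamW)
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```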
### Hardware
- NVIDIA RTX 2080 / A100 / equivalent (local or cloud)
## 📈 Evaluation
- **Evaluation Script:** `training/evaluate_model.py`
- **Metrics:**
  - Perplexity: Reduced ~20% compared to base GPT-2 (a reproduction sketch follows after this list)
- BLEU Score: Improved ~12% in test responses
- Human Evaluation: Higher relevance and emotional alignment in blind tests
- **Examples:** Available in `logs/training.log` and test dialogues
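The figures above come from `training/evaluate_model.py`, which is not reproduced here. As a minimal sketch of how the perplexity comparison against base GPT-2 could be re-run (the held-out snippet is a placeholder):

```python
import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

def perplexity(model_id: str, text: str) -> float:
    tokenizer = GPT2TokenizerFast.from_pretrained(model_id)
    model = GPT2LMHeadModel.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels=input_ids returns the average causal-LM loss
        loss = model(**inputs, labels=inputs["input_ids"]).loss
    return math.exp(loss.item())

held_out = "User: I can't sleep at night.\nAssistant: That sounds exhausting. How long has this been going on?"
print("base gpt2     :", perplexity("gpt2", held_out))
print("mindpadi/gpt2 :", perplexity("mindpadi/gpt2", held_out))
```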
## 📂 Files
| File | Purpose |
|------|---------|
| `config.json` | Model architecture |
| `pytorch_model.bin` or `model.safetensors` | Fine-tuned weights |
| `tokenizer.json`, `vocab.json`, `merges.txt` | Tokenizer and BPE merge rules |
| `checkpoint-*/` | Intermediate training checkpoints (optional) |
## 💬 Example Usage
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Load the fine-tuned model and tokenizer from the Hub
model = GPT2LMHeadModel.from_pretrained("mindpadi/gpt2")
tokenizer = GPT2Tokenizer.from_pretrained("mindpadi/gpt2")

# Prompts follow the "User: ... Assistant:" format used during fine-tuning
prompt = "User: I feel hopeless and tired.\nAssistant:"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a response (do_sample=True gives varied, non-deterministic output)
outputs = model.generate(**inputs, max_length=100, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
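The sampled output will vary between runs; parameters such as `temperature`, `top_p`, and `max_new_tokens` can be passed to `generate` to trade diversity against consistency.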
## 🔧 Integration in MindPadi
This model is integrated into:
* `app/chatbot/fusion_bot.py`: Primary text generator
* `app/chatbot/gpt_router.py`: Fusion routing between GPT-2 and DistilGPT2 (an illustrative sketch follows below)
* `app/chatbot/core.py`: Chat interface logic
* LangGraph workflows: Via GPTFusionRouter nodes
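The actual routing policy lives in `app/chatbot/gpt_router.py` and is not reproduced here; the following is only an illustrative sketch of fusion-style routing between a larger and a lighter model. The length-based rule and the `distilgpt2` model id are assumptions:

```python
from transformers import pipeline

# Illustrative only: prefer the fine-tuned GPT-2 for longer, therapy-style
# prompts and a lighter DistilGPT2 for short exchanges. The real policy in
# app/chatbot/gpt_router.py may use different models and signals.
gpt2_generator = pipeline("text-generation", model="mindpadi/gpt2")
distil_generator = pipeline("text-generation", model="distilgpt2")  # placeholder model id

def route_and_generate(prompt: str) -> str:
    generator = gpt2_generator if len(prompt.split()) > 12 else distil_generator
    result = generator(prompt, max_new_tokens=80, do_sample=True)
    return result[0]["generated_text"]

print(route_and_generate("User: I've been anxious about work and I can't switch off at night.\nAssistant:"))
```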
## ⚠️ Limitations
* **Bias:** May carry biases from internet pretraining or mental health corpora
* **Language:** English-only
* **Token Limit:** ~1024 tokens context window
* **Sensitivity:** May generate inappropriate responses if inputs are adversarial or out-of-distribution
* **Non-Determinism:** Sampling may produce varied outputs even for the same input
## 🔐 Ethical Considerations
* Not intended for crisis response or as a substitute for professional help
* Includes guardrails to detect emergency keywords (e.g., "suicide", "abuse"); a minimal sketch follows below
* Should always inform users they are interacting with AI (see `chat_interface.py`)
* Responses are not clinically validated; user discretion advised
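The guardrail itself is implemented in the MindPadi application code rather than in this model. A minimal sketch of the keyword-based check described above (the keyword list and escalation message are illustrative):

```python
# Illustrative emergency-keyword guardrail; the production check in the
# MindPadi backend may be more sophisticated (e.g., classifier-based).
CRISIS_KEYWORDS = {"suicide", "kill myself", "self-harm", "abuse"}

def needs_escalation(message: str) -> bool:
    lowered = message.lower()
    return any(keyword in lowered for keyword in CRISIS_KEYWORDS)

if needs_escalation("I keep thinking about suicide."):
    print("Please reach out to a crisis line or a professional right away. "
          "I'm an AI and cannot provide emergency help.")
```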
## 🧪 Deployment
You can deploy this model behind a hosted HTTP endpoint, for example a Hugging Face Inference Endpoint or Space, for GPU-powered responses.
```python
import requests

# Replace with your endpoint URL and access token
api_url = "https://<your-endpoint>.hf.space"
headers = {
    "Authorization": "Bearer <your-token>",
    "Content-Type": "application/json",
}

# Prompts use the same "User: ... Assistant:" format over the API
payload = {"inputs": "User: I feel anxious.\nAssistant:"}
response = requests.post(api_url, headers=headers, json=payload)
print(response.json())
```
## 📜 License
MIT License – Free for commercial and non-commercial use with attribution.
## 📬 Contact
* **Project:** [MindPadi AI Chatbot](https://huggingface.co/mindpadi)
* **Maintainer:** MindPadi Team
* **Email:** [email protected]
* **GitHub:** [github.com/mindpadi/mindpadi](https://github.com/mindpadi/mindpadi)
*Last updated: May 2025*