---
license: mit
tags:
- generated_from_trainer
- instruction-tuned
- phi2
- lora
- low-resource
- fine-tuning
datasets:
- yahma/alpaca-cleaned
base_model: microsoft/phi-2
widget:
- text: "### Instruction:\nExplain the concept of gravity.\n\n### Response:"
---
# 🧠 phi2-lora-instruct
This is a **LoRA fine-tuned version of Microsoft’s Phi-2** model trained on 500 examples from the [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned) instruction dataset.
### ✅ Fine-Tuned by:
**[howtomakepplragequit](https://huggingface.co/howtomakepplragequit)** — working on scalable, efficient LLM training for real-world instruction-following.
---
## 🏗️ Model Architecture
- **Base model**: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) (2.7B parameters)
- **Adapter**: LoRA (Low-Rank Adaptation), trained with [PEFT](https://github.com/huggingface/peft)
- **Quantization**: 4-bit NF4 via `bitsandbytes` for efficient memory use
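For context, here is a minimal sketch of how such a setup is typically built with `transformers`, `bitsandbytes`, and `peft`. The LoRA rank, alpha, dropout, and target modules below are illustrative assumptions, not the recorded training configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for memory-efficient loading of the 2.7B base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Illustrative LoRA settings; module names can vary with the transformers version
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```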
---
## 📦 Dataset
- [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned)
- Instruction-based Q&A for natural language understanding and generation
- Covers topics like science, grammar, everyday tasks, and reasoning
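To inspect the data or reproduce a similar subset, the dataset loads directly with the `datasets` library; the 500-example slice below simply mirrors the size of the training set used here:

```python
from datasets import load_dataset

# First 500 examples, matching the size of the training subset for this adapter
dataset = load_dataset("yahma/alpaca-cleaned", split="train[:500]")
print(dataset[0])  # each record has "instruction", "input", and "output" fields
```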
---
## 🛠️ Training Details
- **Training platform**: Google Colab (Free T4 GPU)
- **Epochs**: 2
- **Batch size**: 2 (with gradient accumulation)
- **Optimizer**: AdamW (via Transformers `Trainer`)
- **Training time**: ~20–30 mins
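A rough `Trainer` setup consistent with these settings is sketched below. The learning rate, gradient-accumulation steps, and precision flag are assumptions, and `model` / `tokenized_dataset` refer to the quantized LoRA model and a tokenized version of the 500 Alpaca examples from the sketches above:

```python
from transformers import Trainer, TrainingArguments

# 2 epochs, per-device batch size 2 with gradient accumulation, AdamW via Trainer.
# Learning rate, accumulation steps, and fp16 are illustrative assumptions.
training_args = TrainingArguments(
    output_dir="phi2-lora-instruct",
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
)

trainer = Trainer(
    model=model,                      # PEFT-wrapped Phi-2 from the architecture sketch
    args=training_args,
    train_dataset=tokenized_dataset,  # tokenized Alpaca subset (not shown here)
)
trainer.train()
```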
---
## 📈 Intended Use
- Ideal for **instruction-following tasks**, such as:
  - Explanation
  - Summarization
  - List generation
  - Creative writing
- Can be adapted to **custom domains** (health, code, manufacturing) by adding your own prompts and responses; see the sketch below.
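A minimal helper for that kind of adaptation might format new examples into the same Alpaca-style template this adapter was trained on (the function name and sample text are purely illustrative):

```python
def build_prompt(instruction: str, response: str = "") -> str:
    """Format an example in the Alpaca-style layout used during fine-tuning."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

# Example: a domain-specific pair you could add to your own training set
print(build_prompt("List two safety checks to run before operating a CNC machine."))
```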
---
## 🚀 Example Prompt
**Instruction:**
Give three tips to improve time management.
---
## 🧪 Try it Out
To use this model in your own project:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading the adapter repo directly requires `peft` to be installed so that the
# base Phi-2 weights are fetched and the LoRA adapter is applied on top
model = AutoModelForCausalLM.from_pretrained(
    "howtomakepplragequit/phi2-lora-instruct",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("howtomakepplragequit/phi2-lora-instruct")
input_text = "### Instruction:\nExplain how machine learning works.\n\n### Response:"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
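
If your environment does not resolve the adapter automatically, the base model and adapter can also be loaded explicitly with `peft` (a minimal sketch, assuming this repo hosts the LoRA adapter weights):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the Phi-2 base weights, then attach the LoRA adapter from this repo
base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", device_map="auto")
model = PeftModel.from_pretrained(base, "howtomakepplragequit/phi2-lora-instruct")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
```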