|
--- |
|
license: mit |
|
tags: |
|
- generated-from-train |
|
- instruction-tuned |
|
- phi2 |
|
- lora |
|
- low-resource |
|
- fine-tuning |
|
datasets: |
|
- yahma/alpaca-cleaned |
|
base_model: microsoft/phi-2 |
|
widget: |
|
- text: "### Instruction:\nExplain the concept of gravity.\n\n### Response:" |
|
--- |
|
|
|
# 🧠 phi2-lora-instruct |
|
|
|
This is a **LoRA fine-tuned version of Microsoft’s Phi-2**, trained on 500 examples from the [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned) instruction dataset.
|
|
|
### ✅ Fine-tuned by
|
**[howtomakepplragequit](https://huggingface.co/howtomakepplragequit)** — working on scalable, efficient LLM training for real-world instruction-following. |
|
|
|
--- |
|
|
|
## 🏗️ Model Architecture |
|
|
|
- **Base model**: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) (2.7B parameters) |
|
- **Adapter**: LoRA (Low-Rank Adaptation), trained with [PEFT](https://github.com/huggingface/peft) |
|
- **Quantization**: 4-bit NF4 via `bitsandbytes` for efficient memory use (see the loading sketch below)
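
The sketch below is one way these pieces fit together at load time, assuming the adapter weights in this repository are stored in the standard PEFT format: the base model is loaded in 4-bit NF4 with `bitsandbytes`, and the LoRA adapter is applied on top with `peft`. (The *Try it Out* section further down shows the shorter path of loading the adapter repo directly.)

```python
# Minimal loading sketch (not the only way): 4-bit NF4 base model + LoRA adapter.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# Quantized 2.7B base model.
base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
)

# Apply the LoRA adapter weights from this repository on top of the base model.
model = PeftModel.from_pretrained(base, "howtomakepplragequit/phi2-lora-instruct")
```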
|
|
|
--- |
|
|
|
## 📦 Dataset |
|
|
|
- [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned) |
|
- Instruction-based Q&A for natural language understanding and generation |
|
- Covers topics like science, grammar, everyday tasks, and reasoning |
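
For a quick look at the data the model was trained on (assuming the standard `alpaca-cleaned` columns `instruction`, `input`, and `output`), a 500-example slice can be inspected like this; note that the exact 500 rows used for training are not specified on this card.

```python
# Peek at one example from a 500-row slice of the dataset.
from datasets import load_dataset

sample = load_dataset("yahma/alpaca-cleaned", split="train[:500]")[0]
print(sample["instruction"])
print(sample["output"])
```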
|
|
|
--- |
|
|
|
## 🛠️ Training Details |
|
|
|
- **Training platform**: Google Colab (Free T4 GPU) |
|
- **Epochs**: 2 |
|
- **Batch size**: 2 (with gradient accumulation) |
|
- **Optimizer**: AdamW (via Transformers `Trainer`) |
|
- **Training time**: ~20–30 minutes (a sketch of the full setup is shown below)
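
The following is a minimal sketch of the recipe described above (QLoRA-style: 4-bit NF4 base model, LoRA adapters, `Trainer` with AdamW). The LoRA rank/alpha/target modules, sequence length, learning rate, gradient-accumulation steps, and the `train[:500]` slice are assumptions; only the base model, dataset, 2 epochs, per-device batch size of 2, and 4-bit NF4 quantization come from this card.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    Trainer,
    TrainingArguments,
)

# 4-bit NF4 quantization so the 2.7B base model fits on a free Colab T4.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter; rank, alpha, and target modules here are assumed values.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
tokenizer.pad_token = tokenizer.eos_token

# Wrap each Alpaca example in the card's prompt template and tokenize it
# (the optional "input" field is ignored here for brevity).
def to_features(example):
    text = f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']}"
    tokens = tokenizer(text, truncation=True, max_length=512, padding="max_length")
    tokens["labels"] = tokens["input_ids"].copy()
    return tokens

dataset = load_dataset("yahma/alpaca-cleaned", split="train[:500]").map(to_features)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="phi2-lora-instruct",
        num_train_epochs=2,                # from this card
        per_device_train_batch_size=2,     # from this card
        gradient_accumulation_steps=4,     # assumed; the card only says "with gradient accumulation"
        learning_rate=2e-4,                # assumed
        fp16=True,
        logging_steps=10,
    ),
    train_dataset=dataset,
)
trainer.train()
```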
|
|
|
--- |
|
|
|
## 📈 Intended Use |
|
|
|
- Ideal for **instruction-following tasks**, such as: |
|
- Explanation |
|
- Summarization |
|
- List generation |
|
- Creative writing |
|
- Can be adapted to **custom domains** (health, code, manufacturing) by fine-tuning further on your own prompt/response pairs (see the sketch below).
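
One way to feed your own examples into the same recipe shown under *Training Details*: collect prompt/response pairs using the `alpaca-cleaned`-style field names (the placeholder text below is purely illustrative) and build a `datasets.Dataset` from them.

```python
# Hypothetical custom examples; field names mirror alpaca-cleaned ("instruction"/"output").
from datasets import Dataset

custom_examples = [
    {
        "instruction": "List three daily safety checks for the packaging line.",
        "output": "1. ...\n2. ...\n3. ...",  # placeholder response text
    },
]

# This dataset can be mapped through the same prompt template and passed to the Trainer.
custom_dataset = Dataset.from_list(custom_examples)
```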
|
|
|
--- |
|
|
|
## 🚀 Example Prompt |
|
|
|
Instruction: |
|
Give three tips to improve time management. |
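
Wrapped in the prompt template used by the widget and the usage snippet below, this instruction would be passed to the model as:

```python
# The example instruction, formatted with the card's "### Instruction / ### Response" template.
prompt = "### Instruction:\nGive three tips to improve time management.\n\n### Response:"
```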
|
|
|
--- |
|
|
|
## 🧪 Try it Out |
|
|
|
To use this model in your own project: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading the adapter repo directly requires `peft` to be installed;
# transformers resolves the base model (microsoft/phi-2) and applies the LoRA weights.
model = AutoModelForCausalLM.from_pretrained(
    "howtomakepplragequit/phi2-lora-instruct",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("howtomakepplragequit/phi2-lora-instruct")

input_text = "### Instruction:\nExplain how machine learning works.\n\n### Response:"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```