---
license: mit
tags:
- generated_from_trainer
- instruction-tuned
- phi2
- lora
- low-resource
- fine-tuning
datasets:
- yahma/alpaca-cleaned
base_model: microsoft/phi-2
widget:
- text: "### Instruction:\nExplain the concept of gravity.\n\n### Response:"
---
# 🧠 phi2-lora-instruct
This is a **LoRA fine-tuned version of Microsoft’s Phi-2** model trained on 500 examples from the [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned) instruction dataset.
### ✅ Fine-Tuned by:
**[howtomakepplragequit](https://huggingface.co/howtomakepplragequit)** — working on scalable, efficient LLM training for real-world instruction-following.
---
## 🏗️ Model Architecture
- **Base model**: [microsoft/phi-2](https://huggingface.co/microsoft/phi-2) (2.7B parameters)
- **Adapter**: LoRA (Low-Rank Adaptation), trained with [PEFT](https://github.com/huggingface/peft)
- **Quantization**: 4-bit NF4 via `bitsandbytes` for efficient memory use
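For context, here is a minimal sketch of how such a setup is typically built with `transformers`, `bitsandbytes`, and `peft`. The LoRA rank, alpha, dropout, and target modules below are illustrative assumptions, not the recorded training configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization for memory-efficient loading of the 2.7B base model
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Illustrative LoRA settings; module names can vary with the transformers version
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```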
---
## 📦 Dataset
- [`yahma/alpaca-cleaned`](https://huggingface.co/datasets/yahma/alpaca-cleaned)
- Instruction-based Q&A for natural language understanding and generation
- Covers topics like science, grammar, everyday tasks, and reasoning
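To inspect the data or reproduce a similar subset, the dataset loads directly with the `datasets` library; the 500-example slice below simply mirrors the size of the training set used here:

```python
from datasets import load_dataset

# First 500 examples, matching the size of the training subset for this adapter
dataset = load_dataset("yahma/alpaca-cleaned", split="train[:500]")
print(dataset[0])  # each record has "instruction", "input", and "output" fields
```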
---
## 🛠️ Training Details
- **Training platform**: Google Colab (Free T4 GPU)
- **Epochs**: 2
- **Batch size**: 2 (with gradient accumulation)
- **Optimizer**: AdamW (via Transformers `Trainer`)
- **Training time**: ~20–30 mins
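A rough `Trainer` setup consistent with these settings is sketched below. The learning rate, gradient-accumulation steps, and precision flag are assumptions, and `model` / `tokenized_dataset` refer to the quantized LoRA model and a tokenized version of the 500 Alpaca examples from the sketches above:

```python
from transformers import Trainer, TrainingArguments

# 2 epochs, per-device batch size 2 with gradient accumulation, AdamW via Trainer.
# Learning rate, accumulation steps, and fp16 are illustrative assumptions.
training_args = TrainingArguments(
    output_dir="phi2-lora-instruct",
    num_train_epochs=2,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
)

trainer = Trainer(
    model=model,                      # PEFT-wrapped Phi-2 from the architecture sketch
    args=training_args,
    train_dataset=tokenized_dataset,  # tokenized Alpaca subset (not shown here)
)
trainer.train()
```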
---
## 📈 Intended Use
- Ideal for **instruction-following tasks**, such as:
  - Explanation
  - Summarization
  - List generation
  - Creative writing
- Can be adapted to **custom domains** (health, code, manufacturing) by adding your own prompts and responses; see the sketch below.
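A minimal helper for that kind of adaptation might format new examples into the same Alpaca-style template this adapter was trained on (the function name and sample text are purely illustrative):

```python
def build_prompt(instruction: str, response: str = "") -> str:
    """Format an example in the Alpaca-style layout used during fine-tuning."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n{response}"

# Example: a domain-specific pair you could add to your own training set
print(build_prompt("List two safety checks to run before operating a CNC machine."))
```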
---
## 🚀 Example Prompt
**Instruction:**
Give three tips to improve time management.
---
## 🧪 Try it Out
To use this model in your own project:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Loading the adapter repo directly requires `peft` to be installed so that the
# base Phi-2 weights are fetched and the LoRA adapter is applied on top
model = AutoModelForCausalLM.from_pretrained(
    "howtomakepplragequit/phi2-lora-instruct",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("howtomakepplragequit/phi2-lora-instruct")
input_text = "### Instruction:\nExplain how machine learning works.\n\n### Response:"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```
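
If your environment does not resolve the adapter automatically, the base model and adapter can also be loaded explicitly with `peft` (a minimal sketch, assuming this repo hosts the LoRA adapter weights):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the Phi-2 base weights, then attach the LoRA adapter from this repo
base = AutoModelForCausalLM.from_pretrained("microsoft/phi-2", device_map="auto")
model = PeftModel.from_pretrained(base, "howtomakepplragequit/phi2-lora-instruct")
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2")
```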