	---
	library_name: peft
	base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
	pipeline_tag: text-generation
	tags:
	- lora
	- adapters
	- tinyllama
	- youtube
	- conversational
	- text-generation
	license: apache-2.0
	---

	# TinyLlama YouTube Replies (LoRA)

	This model is a LoRA fine-tuned version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), designed to generate concise, friendly, domain-specific replies to YouTube comments on AI/ML content. The project demonstrates how Low-Rank Adaptation (LoRA) can adapt a lightweight language model to a conversational task. The model occasionally produces out-of-context replies and would benefit from further optimization, but it serves as a working example of a fine-tuning pipeline.

	## Model Details

	- Base Model: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
	- Fine-Tuning Method: LoRA (Low-Rank Adaptation)
	- Task: Generating short, engaging replies to AI/ML YouTube comments
	- Language: English
	- License: Apache 2.0
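
To give a feel for why LoRA counts as lightweight adaptation, here is a rough back-of-the-envelope sketch. It uses the rank and alpha listed under Training Details (r = 16, alpha = 32) and TinyLlama's hidden size of 2048. Which projection layers were actually adapted is not stated in this card, so the single square 2048 x 2048 projection below is an assumption for illustration only.

```python
def lora_param_count(d_in, d_out, r):
    """Trainable parameters in one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * d_in + d_out * r


r, alpha = 16, 32   # values from the Training Details section
hidden = 2048       # TinyLlama-1.1B hidden size

# Assumption: one square projection (hidden x hidden) is adapted.
full = hidden * hidden                        # weights in the frozen matrix
adapter = lora_param_count(hidden, hidden, r)  # weights that LoRA trains instead
scaling = alpha / r                            # factor applied to the low-rank update B @ A

print(f"frozen: {full:,}  trainable: {adapter:,}  ratio: {adapter / full:.2%}")
print(f"update scaling (alpha / r): {scaling}")
```

For this one matrix the adapter trains 65,536 weights against 4,194,304 frozen ones, roughly 1.6%.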

	## Intended Use

	This model is intended for:
	- Generating polite and engaging replies to AI/ML-related YouTube comments.
	- Demonstrating a fine-tuning project using LoRA for lightweight adaptation.
	- Research or educational purposes in conversational AI.

	Not Intended For:
	- Production environments without further optimization.
	- Non-English text generation.
	- Applications requiring high contextual accuracy without human review.

	## Usage

	To use this model, you need the `transformers`, `peft`, and `torch` libraries, plus `accelerate` because the example loads the base model with `device_map="auto"`. The example below loads the adapter and generates a reply:

	```python
	import torch
	from transformers import AutoTokenizer, AutoModelForCausalLM
	from peft import PeftModel

	# Load the base model, tokenizer, and LoRA adapters
	base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
	adapter_id = "AdamDE/tinyllama-custom-youtube-replies"
	tokenizer = AutoTokenizer.from_pretrained(adapter_id)
	base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
	model = PeftModel.from_pretrained(base_model, adapter_id)
	model.eval()

	# Prepare input
	messages = [
	    {"role": "system", "content": "You are an AI/ML tutorial creator replying to YouTube comments. "
	                                  "Provide concise, friendly, and domain-specific help, encourage engagement, "
	                                  "and keep a positive tone with occasional emojis when appropriate."},
	    {"role": "user", "content": "Your enthusiasm is contagious!"},
	]
	# return_dict=True yields input_ids and attention_mask, passed to generate() as keywords
	inputs = tokenizer.apply_chat_template(
	    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
	).to(model.device)

	# Generate reply; do_sample=True is required for temperature and top_p to take effect
	with torch.no_grad():
	    out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9,
	                         pad_token_id=tokenizer.eos_token_id)

	# Decode only the newly generated tokens so the prompt is not echoed back
	prompt_len = inputs["input_ids"].shape[-1]
	reply = tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True)
	print(reply)
	# Example output (sampling makes results vary): "Haha, thanks! 😂 What's your favorite part?"
	```

	### Requirements
	```bash
	pip install transformers peft torch accelerate
	```

	### Notes
	- Use a clear, comment-like prompt for best results.
	- Adjust `max_new_tokens`, `temperature`, and `top_p` to control reply length and creativity.
	- The model may occasionally generate out-of-context replies, indicating room for further optimization.
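
To make the effect of `temperature` and `top_p` more concrete, here is a toy sketch in plain Python. It is not the `transformers` implementation, and the four logits are made-up numbers. It only illustrates the idea: temperature rescales the scores before the softmax, and top-p keeps the smallest set of most likely tokens whose combined probability reaches the threshold.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=0.9):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


# Hypothetical logits for a four-token vocabulary
toy_logits = [2.0, 1.0, 0.5, -1.0]
sharp = softmax_with_temperature(toy_logits, temperature=0.7)
flat = softmax_with_temperature(toy_logits, temperature=1.5)
nucleus = top_p_filter(sharp, top_p=0.9)

print("temperature 0.7:", [round(p, 3) for p in sharp])
print("temperature 1.5:", [round(p, 3) for p in flat])
print("after top_p 0.9:", [round(p, 3) for p in nucleus])
```

Lower temperature concentrates probability on the top token, which tends to give safer, more repetitive replies. A lower `top_p` removes the unlikely tail entirely.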

	## Training Details

	- Dataset: Custom JSON dataset of AI/ML YouTube comments and replies, split into train, validation, and test sets.
	- Training Procedure: LoRA fine-tuning with 4-bit quantization (NF4) and mixed precision (bf16/fp16).
	- Hyperparameters:
	  - LoRA Rank (r): 16
	  - LoRA Alpha: 32
	  - LoRA Dropout: 0.05
	  - Epochs: 5
	  - Learning Rate: 2e-4
	  - Optimizer: Paged AdamW 8-bit
	  - Scheduler: Cosine with 10% warmup
	- Evaluation Metrics:
	  - BLEU and ROUGE scores computed on the test set (see training script for details).
	- Training Features:
	  - Gradient checkpointing for memory efficiency.
	  - Early stopping with patience of 2 epochs based on validation loss.
	  - Custom data collator for padding and label masking.
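
Two of the items above are easier to picture with a short example: the cosine schedule with 10% warmup and the label masking done by the collator. The sketch below is plain Python and is not taken from the training script. The function names, the `prompt_len` field, and the choice to mask prompt tokens as well as padding are assumptions made for illustration.

```python
import math

IGNORE_INDEX = -100  # label value that PyTorch's cross-entropy loss ignores by default


def lr_at_step(step, total_steps, base_lr=2e-4, warmup_ratio=0.1):
    """Linear warmup over the first warmup_ratio of steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


def collate(examples, pad_token_id):
    """Pad a batch to its longest sequence and mask prompt and padding positions in the labels.

    Each example is a dict with `input_ids` (prompt plus reply tokens) and
    `prompt_len` (number of prompt tokens). Both field names are hypothetical.
    """
    max_len = max(len(ex["input_ids"]) for ex in examples)
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in examples:
        ids = ex["input_ids"]
        pad = max_len - len(ids)
        # Only reply tokens contribute to the loss
        labels = [IGNORE_INDEX] * ex["prompt_len"] + ids[ex["prompt_len"]:]
        batch["input_ids"].append(ids + [pad_token_id] * pad)
        batch["attention_mask"].append([1] * len(ids) + [0] * pad)
        batch["labels"].append(labels + [IGNORE_INDEX] * pad)
    return batch


demo_batch = collate(
    [{"input_ids": [1, 2, 3, 4], "prompt_len": 2},
     {"input_ids": [5, 6, 7], "prompt_len": 1}],
    pad_token_id=0,
)
print(demo_batch["labels"])
print([lr_at_step(s, 100) for s in (0, 5, 10, 55, 100)])
```

In a real pipeline the lists would be converted to tensors, and the schedule would normally come from the training framework instead of a hand-written function.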

	## Performance

	BLEU and ROUGE scores on the test set indicate reasonable reply quality for a project of this scope, though this card does not list the exact figures. Occasional out-of-context responses suggest room for improvement in dataset quality or hyperparameter tuning.
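
For readers unfamiliar with overlap metrics, here is a simplified sketch of a ROUGE-1 style F1 score in plain Python. It is not the metric code from the training script. Real ROUGE implementations add their own tokenization and optional stemming, and the two sentences are made-up examples.

```python
from collections import Counter


def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference reply and a generated reply."""
    ref_tokens = reference.lower().split()
    cand_tokens = candidate.lower().split()
    if not ref_tokens or not cand_tokens:
        return 0.0
    # Counter intersection keeps the minimum count of each shared word
    overlap = sum((Counter(ref_tokens) & Counter(cand_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


score = rouge1_f1("thanks for watching the video", "thanks for watching")
print(round(score, 2))
```

Scores like this reward word overlap with a reference reply, so a fluent but off-topic reply can still score poorly, and a short generic reply can score deceptively well. Human review remains useful alongside them.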

	## Limitations

	- May generate out-of-context or generic replies, requiring human review.
	- Optimized for AI/ML YouTube comments; performance may vary for other domains.
	- Limited to English-language inputs and outputs.

	## Ethical Considerations

	- Generated replies should be reviewed to ensure they are appropriate and constructive.
	- Use responsibly to foster positive community interactions.