---
library_name: peft
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
pipeline_tag: text-generation
tags:
- lora
- adapters
- tinyllama
- youtube
- conversational
- text-generation
license: apache-2.0
---

# TinyLlama YouTube Replies (LoRA)

This model is a **LoRA fine-tuned** version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), designed to generate **concise, friendly, and domain-specific replies** to YouTube comments on AI/ML-related content. It uses Low-Rank Adaptation (LoRA) to adapt a lightweight language model to a conversational task. The model occasionally produces out-of-context replies and would benefit from further optimization; it is primarily a demonstration of a working fine-tuning pipeline.

## Model Details

- **Base Model**: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
- **Fine-Tuning Method**: LoRA (Low-Rank Adaptation)
- **Task**: Generating short, engaging replies to AI/ML YouTube comments
- **Language**: English
- **License**: Apache 2.0

## Intended Use

This model is intended for:
- Generating polite and engaging replies to AI/ML-related YouTube comments.
- Demonstrating a fine-tuning project using LoRA for lightweight adaptation.
- Research or educational purposes in conversational AI.

**Not Intended For**:
- Production environments without further optimization.
- Non-English text generation.
- Applications requiring high contextual accuracy without human review.

## Usage

To use this model, install the `transformers` and `peft` libraries. The example below loads the base model and the LoRA adapter, then generates a reply to a single comment:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer, the base model, and the LoRA adapter
base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "AdamDE/tinyllama-custom-youtube-replies"
tokenizer = AutoTokenizer.from_pretrained(adapter_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Build the chat prompt (system instruction plus the comment to answer)
messages = [
    {"role": "system", "content": "You are an AI/ML tutorial creator replying to YouTube comments. "
                                  "Provide concise, friendly, and domain-specific help, encourage engagement, "
                                  "and keep a positive tone with occasional emojis when appropriate."},
    {"role": "user", "content": "Your enthusiasm is contagious!"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# Generate a reply; do_sample=True is needed for temperature and top_p to take effect
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7,
                         top_p=0.9, pad_token_id=tokenizer.eos_token_id)

# Decode only the newly generated tokens so the prompt is not echoed back
prompt_len = inputs["input_ids"].shape[-1]
reply = tokenizer.decode(out[0][prompt_len:], skip_special_tokens=True)
print(reply)
# Example output (varies because of sampling): "Haha, thanks! 😂 What's your favorite part?"
```

### Requirements

`accelerate` is included because loading a model with `device_map="auto"` requires it.

```bash
pip install transformers peft torch accelerate
```

### Notes

- Use a clear, comment-like prompt for best results.
- Adjust `max_new_tokens` to control reply length, and `temperature` and `top_p` to control creativity. The two sampling parameters take effect only when `do_sample=True` is passed to `generate`.
- The model may occasionally generate out-of-context replies, which indicates room for further optimization.

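
To make the effect of `temperature` and `top_p` concrete, here is a conceptual sketch in plain Python. It is not the `transformers` implementation; the function names and the toy logits are assumptions chosen for illustration only.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits / temperature; a lower temperature sharpens the distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative mass reaches top_p, then renormalize."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return {i: probs[i] / mass for i in kept}

logits = [2.0, 1.0, 0.5, -1.0]  # toy scores for a 4-token vocabulary (illustrative)
probs = apply_temperature(logits, temperature=0.7)
nucleus = top_p_filter(probs, top_p=0.9)
print(sorted(nucleus))  # token ids that remain eligible for sampling
```

With these toy logits, `temperature=0.7` leaves two candidate tokens inside the `top_p=0.9` nucleus, while `temperature=1.0` leaves three: raising either parameter widens the pool of tokens the model can sample from.
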
## Training Details

- **Dataset**: Custom JSON dataset of AI/ML YouTube comments and replies, split into train, validation, and test sets.
- **Training Procedure**: LoRA fine-tuning with 4-bit quantization (NF4) and mixed precision (bf16/fp16).
- **Hyperparameters**:
  - LoRA Rank (r): 16
  - LoRA Alpha: 32
  - LoRA Dropout: 0.05
  - Epochs: 5
  - Learning Rate: 2e-4
  - Optimizer: Paged AdamW 8-bit
  - Scheduler: Cosine with 10% warmup
- **Evaluation Metrics**:
  - BLEU and ROUGE scores computed on the test set (see training script for details).
- **Training Features**:
  - Gradient checkpointing for memory efficiency.
  - Early stopping with patience of 2 epochs based on validation loss.
  - Custom data collator for padding and label masking.

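
The training script's collator is not shown in this card, so the following is only a minimal sketch of what "padding and label masking" commonly means. It assumes that prompt tokens and padding are masked with `-100` (the label value that PyTorch's cross-entropy loss ignores) and that each example carries a hypothetical `prompt_len` field; it returns plain lists, where a real collator would return tensors.

```python
IGNORE_INDEX = -100  # label value skipped by PyTorch's cross-entropy loss

def collate(batch, pad_token_id):
    """Pad a batch to its longest sequence and mask prompt and padding positions in the labels."""
    max_len = max(len(ex["input_ids"]) for ex in batch)
    out = {"input_ids": [], "attention_mask": [], "labels": []}
    for ex in batch:
        ids = ex["input_ids"]
        pad = max_len - len(ids)
        # Assumption: only the reply tokens (after the prompt) contribute to the loss.
        labels = [IGNORE_INDEX] * ex["prompt_len"] + ids[ex["prompt_len"]:]
        out["input_ids"].append(ids + [pad_token_id] * pad)
        out["attention_mask"].append([1] * len(ids) + [0] * pad)
        out["labels"].append(labels + [IGNORE_INDEX] * pad)
    return out

# Toy token ids; "prompt_len" is a hypothetical field marking where the reply starts.
batch = [
    {"input_ids": [5, 6, 7, 8, 9], "prompt_len": 3},
    {"input_ids": [5, 6, 4], "prompt_len": 2},
]
collated = collate(batch, pad_token_id=0)
print(collated["labels"])
```

Masking the prompt means the loss is driven only by the reply tokens, which is a common choice for comment-and-reply data; whether the original script masks the prompt as well as the padding is not stated.
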
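
As a rough worked example of what the LoRA rank and alpha in the hyperparameter list imply, the arithmetic below uses the standard LoRA formulation (update scaled by `alpha / r`, adapter size `r * (d_in + d_out)`). The 2048 x 2048 layer shape is an assumption for illustration; the card does not say which modules were adapted.

```python
def lora_trainable_params(d_in, d_out, r):
    """A is r x d_in and B is d_out x r, so LoRA adds r * (d_in + d_out) weights."""
    return r * (d_in + d_out)

r, alpha = 16, 32      # values from the hyperparameter list
d_in = d_out = 2048    # assumed example layer shape, not taken from the card

scaling = alpha / r    # the low-rank update B @ A is multiplied by alpha / r
added = lora_trainable_params(d_in, d_out, r)
full = d_in * d_out    # weights in the frozen layer
print(scaling, added, full, added / full)
```

Under these assumptions the update is scaled by 2.0, and a single adapted layer gains 65,536 trainable weights, about 1.6% of the 4,194,304 frozen weights it sits beside.
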
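
The scheduler entry (cosine with 10% warmup, peak learning rate 2e-4) can be sketched as a plain function of the step number. This follows the usual shape of a linear warmup followed by a half-cosine decay to zero; the total step count is hypothetical because the card does not state it.

```python
import math

def lr_at(step, total_steps, peak_lr=2e-4, warmup_ratio=0.10):
    """Linear warmup to peak_lr over the first 10% of steps, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total_steps = 1000  # hypothetical; the card does not state the number of steps
for step in (0, 50, 100, 550, 1000):
    print(step, f"{lr_at(step, total_steps):.2e}")
```

The learning rate climbs from zero to 2e-4 at step 100, is back at half the peak midway through the decay, and reaches zero at the final step.
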
## Performance

The model achieves reasonable performance for a small fine-tuning project: BLEU and ROUGE scores on the test set indicate decent reply quality (see the training script for details). However, occasional out-of-context responses suggest that dataset quality or hyperparameter tuning could still be improved.

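
For readers unfamiliar with ROUGE, the sketch below computes a simplified ROUGE-1 F1 score (unigram overlap) between a reference reply and a generated reply. It is not the evaluation code used for this model: it lowercases and splits on whitespace only, and the two sentences are made-up examples.

```python
from collections import Counter

def rouge1_f1(reference, candidate):
    """Unigram-overlap F1 between a reference reply and a generated reply."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    overlap = sum((ref & cand).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

reference = "thanks for watching the video"  # made-up reference reply
candidate = "thanks so much for watching"    # made-up generated reply
score = rouge1_f1(reference, candidate)
print(round(score, 3))
```

Three of the five words in each sentence match, so precision and recall are both 0.6 here. Overlap metrics like this reward shared wording, which is why a fluent but off-topic reply can still score poorly and a generic reply can score deceptively well.
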
## Limitations

- May generate out-of-context or generic replies, requiring human review.
- Optimized for AI/ML YouTube comments; performance may vary for other domains.
- Limited to English-language inputs and outputs.

## Ethical Considerations

- Generated replies should be reviewed to ensure they are appropriate and constructive.
- Use responsibly to foster positive community interactions.