AdamDE committed (verified)
Commit c7daeae · 1 Parent(s): 4fb90d5

Update README.md with improved model card

Files changed (1)
  1. README.md +103 -4
README.md CHANGED
@@ -6,16 +6,115 @@ tags:
  - lora
  - adapters
  - tinyllama
+ - youtube
+ - conversational
+ - text-generation
+ license: apache-2.0
  ---

  # TinyLlama YouTube Replies (LoRA)

- LoRA adapters to fine-tune `TinyLlama-1.1B-Chat-v1.0` for generating concise, friendly replies to AI/ML YouTube comments.
+ This model is a **LoRA fine-tuned** version of [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0), designed to generate **concise, friendly, and domain-specific replies** to YouTube comments on AI/ML-related content. It uses Low-Rank Adaptation (LoRA) to adapt a lightweight language model to a conversational task. The model may occasionally produce out-of-context replies and could benefit from further optimization, but it demonstrates a complete, functional fine-tuning pipeline.
+
+ ## Model Details
+
+ - **Base Model**: [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)
+ - **Fine-Tuning Method**: LoRA (Low-Rank Adaptation)
+ - **Task**: Generating short, engaging replies to AI/ML YouTube comments
+ - **Language**: English
+ - **License**: Apache 2.0
+
+ ## Intended Use
+
+ This model is intended for:
+ - Generating polite and engaging replies to AI/ML-related YouTube comments.
+ - Demonstrating a fine-tuning project using LoRA for lightweight adaptation.
+ - Research or educational purposes in conversational AI.
+
+ **Not Intended For**:
+ - Production environments without further optimization.
+ - Non-English text generation.
+ - Applications requiring high contextual accuracy without human review.

  ## Usage
+
+ To use this model, you need the `transformers` and `peft` libraries. Below is an example of how to load the adapters and generate replies:
+
  ```python
+ import torch
  from transformers import AutoTokenizer, AutoModelForCausalLM
  from peft import PeftModel
- base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
- tok = AutoTokenizer.from_pretrained("your-username/tinyllama-youtube-replies-lora")
- model = PeftModel.from_pretrained(base, "your-username/tinyllama-youtube-replies-lora")
+
+ # Load the base model, tokenizer, and LoRA adapters
+ base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
+ adapter_id = "AdamDE/tinyllama-custom-youtube-replies"
+ tokenizer = AutoTokenizer.from_pretrained(adapter_id)
+ base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16, device_map="auto")
+ model = PeftModel.from_pretrained(base_model, adapter_id)
+
+ # Prepare input
+ messages = [
+     {"role": "system", "content": "You are an AI/ML tutorial creator replying to YouTube comments. "
+                                    "Provide concise, friendly, and domain-specific help, encourage engagement, "
+                                    "and keep a positive tone with occasional emojis when appropriate."},
+     {"role": "user", "content": "Your enthusiasm is contagious!"}
+ ]
+ inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
+
+ # Generate reply (do_sample=True so temperature/top_p actually take effect)
+ with torch.no_grad():
+     out = model.generate(inputs, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9, pad_token_id=tokenizer.eos_token_id)
+ # Decode only the newly generated tokens (the reply), not the prompt
+ reply = tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True)
+ print(reply)
+ # Example output: "Haha, thanks! 😂 What's your favorite part?"
+ ```
+
+ ### Requirements
+ ```bash
+ pip install transformers peft torch accelerate
+ ```
+ (`accelerate` is required for `device_map="auto"` in the example above.)
+
+ ### Notes
+ - Use a clear, comment-like prompt for best results.
+ - Adjust `max_new_tokens`, `temperature`, and `top_p` to control reply length and creativity, as sketched below.
+ - The model may occasionally generate out-of-context replies, indicating room for further optimization.
+
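As a rough illustration of the tuning note above, here are two alternative `generate` calls, reusing `model`, `tokenizer`, and `inputs` from the usage example; the specific values are only examples, not settings recommended by the card:

```python
# Shorter, more deterministic replies: greedy decoding, fewer new tokens
out = model.generate(inputs, max_new_tokens=64, do_sample=False,
                     pad_token_id=tokenizer.eos_token_id)

# Longer, more varied replies: sampling with a higher temperature and broader top_p
out = model.generate(inputs, max_new_tokens=160, do_sample=True,
                     temperature=0.9, top_p=0.95,
                     pad_token_id=tokenizer.eos_token_id)
```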
+ ## Training Details
+
+ - **Dataset**: Custom JSON dataset of AI/ML YouTube comments and replies, split into train, validation, and test sets.
+ - **Training Procedure**: LoRA fine-tuning with 4-bit quantization (NF4) and mixed precision (bf16/fp16).
+ - **Hyperparameters**:
+   - LoRA Rank (r): 16
+   - LoRA Alpha: 32
+   - LoRA Dropout: 0.05
+   - Epochs: 5
+   - Learning Rate: 2e-4
+   - Optimizer: Paged AdamW 8-bit
+   - Scheduler: Cosine with 10% warmup
+ - **Evaluation Metrics**:
+   - BLEU and ROUGE scores computed on the test set (see training script for details).
+ - **Training Features**:
+   - Gradient checkpointing for memory efficiency.
+   - Early stopping with patience of 2 epochs based on validation loss.
+   - Custom data collator for padding and label masking.
+
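The training script itself is not part of this card, but the hyperparameters listed above map fairly directly onto `peft` and `transformers` configuration objects. A minimal sketch, assuming a standard `Trainer` setup; `target_modules`, the dataset variables (`train_ds`, `val_ds`), the custom `collator`, and the output path are assumptions rather than values taken from the card:

```python
import torch
from transformers import (AutoModelForCausalLM, BitsAndBytesConfig,
                          TrainingArguments, Trainer, EarlyStoppingCallback)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# 4-bit NF4 quantization, as described in the training procedure
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config, device_map="auto")
model = prepare_model_for_kbit_training(model)

# LoRA settings from the hyperparameter list; target_modules is an assumption
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="tinyllama-youtube-replies-lora",  # placeholder path
    num_train_epochs=5,
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    bf16=True,                      # use fp16=True instead on GPUs without bf16 support
    gradient_checkpointing=True,
    eval_strategy="epoch",          # `evaluation_strategy` on older transformers versions
    save_strategy="epoch",
    load_best_model_at_end=True,    # required for early stopping on validation loss
    metric_for_best_model="eval_loss",
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,   # assumed: tokenized train split of the custom YouTube dataset
    eval_dataset=val_ds,      # assumed: tokenized validation split
    data_collator=collator,   # assumed: the custom padding/label-masking collator
    callbacks=[EarlyStoppingCallback(early_stopping_patience=2)],
)
trainer.train()
```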
+ ## Performance
+
+ The model achieves reasonable performance for a fine-tuning project, with BLEU and ROUGE scores on the test set indicating decent reply quality. However, occasional out-of-context responses suggest there is room for improvement in dataset quality or hyperparameter tuning.
+
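The card does not report the actual scores, but BLEU and ROUGE on the held-out test set can be computed along these lines with the `evaluate` library (an assumption — the original training script may do this differently); `test_comments` and `reference_replies` are hypothetical parallel lists, and the helper simply reuses the generation code from the usage example:

```python
import evaluate  # pip install evaluate rouge_score

def generate_reply(comment: str) -> str:
    # Reuses `model` and `tokenizer` loaded as in the usage example above
    # (the system prompt from that example could be prepended here as well)
    msgs = [{"role": "user", "content": comment}]
    ids = tokenizer.apply_chat_template(msgs, add_generation_prompt=True, return_tensors="pt").to(model.device)
    out = model.generate(ids, max_new_tokens=128, do_sample=True, temperature=0.7, top_p=0.9,
                         pad_token_id=tokenizer.eos_token_id)
    return tokenizer.decode(out[0][ids.shape[-1]:], skip_special_tokens=True)

# Hypothetical parallel lists from the test split: comments and their human-written replies
predictions = [generate_reply(c) for c in test_comments]

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")
print(bleu.compute(predictions=predictions, references=[[r] for r in reference_replies]))
print(rouge.compute(predictions=predictions, references=reference_replies))
```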
+ ## Limitations
+
+ - May generate out-of-context or generic replies, requiring human review.
+ - Optimized for AI/ML YouTube comments; performance may vary for other domains.
+ - Limited to English-language inputs and outputs.
+
+ ## Ethical Considerations
+
+ - Generated replies should be reviewed to ensure they are appropriate and constructive.
+ - Use responsibly to foster positive community interactions.
+
+ ## Contact
+
+ For questions or feedback, please open a discussion on this model's Hugging Face page or reach out via the [AdamDE](https://huggingface.co/AdamDE) profile.
+
+ ---