learn-abc committed · Commit 9875abe · verified · 1 Parent(s): 1de0576

Update README.md

Files changed (1)
  1. README.md +172 -151

README.md CHANGED
@@ -6,202 +6,223 @@ tags:
  - base_model:adapter:TinyLlama/TinyLlama-1.1B-Chat-v1.0
  - lora
  - transformers
  ---

- # Model Card for Model ID
-
- <!-- Provide a quick summary of what the model is/does. -->
-

  ## Model Details
-
  ### Model Description

- <!-- Provide a longer summary of what this model is. -->
-
- - **Developed by:** [More Information Needed]
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Finetuned from model [optional]:** [More Information Needed]
-
- ### Model Sources [optional]
-
- <!-- Provide the basic links for the model. -->
-
- - **Repository:** [More Information Needed]
- - **Paper [optional]:** [More Information Needed]
- - **Demo [optional]:** [More Information Needed]

  ## Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
  ### Direct Use

- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ### Downstream Use [optional]

- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]

  ### Out-of-Scope Use

- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]

  ## Bias, Risks, and Limitations

- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]

  ### Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

  ## How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- [More Information Needed]

  ## Training Details
-
  ### Training Data
-
- <!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]

  ### Training Procedure
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- #### Preprocessing [optional]
-
- [More Information Needed]
-
- #### Training Hyperparameters
-
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-
- #### Speeds, Sizes, Times [optional]
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]

  ## Evaluation

- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ### Testing Data, Factors & Metrics
-
- #### Testing Data
-
- <!-- This should link to a Dataset Card if possible. -->
-
- [More Information Needed]

- #### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- #### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]

  ### Results

- [More Information Needed]
-
- #### Summary
-
- ## Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]

  ## Environmental Impact

- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]

  ### Model Architecture and Objective
-
- [More Information Needed]

  ### Compute Infrastructure
-
- [More Information Needed]
-
- #### Hardware
-
- [More Information Needed]
-
- #### Software
-
- [More Information Needed]
-
- ## Citation [optional]
-
- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
- **BibTeX:**
-
- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- ## Glossary [optional]
-
- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- ## More Information [optional]
-
- [More Information Needed]
-
- ## Model Card Authors [optional]
-
- [More Information Needed]

  ## Model Card Contact

- [More Information Needed]
-
- ### Framework versions
-
- - PEFT 0.17.0
 
 
  - base_model:adapter:TinyLlama/TinyLlama-1.1B-Chat-v1.0
  - lora
  - transformers
+ - text-generation
+ - fine-tuned
+ - quotes
+ - tinyllama
  ---

+ # Model Card for learn-abc/tinyllama-custom-quotes
+
+ This model is a PEFT (LoRA) fine-tuned version of `TinyLlama/TinyLlama-1.1B-Chat-v1.0`. It is specialized to act as an AI assistant that, given an inspiring quote, returns the author's name, following a specific instruction-based chat format.

  ## Model Details

  ### Model Description
+ This model is a specialized version of `TinyLlama-1.1B-Chat-v1.0`, fine-tuned using the QLoRA technique. The objective of the fine-tuning was to adapt the base LLM to a specific task: generating the author's name for a given inspiring quote. It adheres to a conversational instruction format, making it suitable for focused Q&A over a dataset of quotes and authors.
+
+ * **Developed by:** learn-abc (Abhishek Singh)
+ * **Model type:** Causal language model (fine-tuned LoRA adapter)
+ * **Language(s) (NLP):** English
+ * **License:** MIT
+ * **Finetuned from model:** TinyLlama/TinyLlama-1.1B-Chat-v1.0

+ ### Model Sources
+ * **Repository:** https://huggingface.co/learn-abc/tinyllama-custom-quotes

  ## Uses

  ### Direct Use
+ This model is intended for direct use in applications that need highly specialized text generation around quotes. Prompted with an inspiring quote in the predefined instruction format, it generates the corresponding author. It is well suited to:
+
+ * Automated quote attribution systems.
+ * Educational tools for learning about famous quotes.
+ * Integrating a quote-lookup feature into a chatbot or application.

+ ### Downstream Use
+ This fine-tuned adapter can be integrated into larger systems that require accurate quote-to-author mapping, for example:
+
+ * Content creation tools that work with quotations.
+ * A component of a larger RAG system where quotes need specific attribution.
+ * Specialized virtual assistants focused on literary or motivational content.

  ### Out-of-Scope Use
+ This model is not intended for:
+
+ * Generating general conversational text or engaging in open-ended dialogue.
+ * Providing factual information on topics outside of quote attribution.
+ * Generating code or structured data (unless further fine-tuned for such tasks).
+ * Use in high-stakes applications requiring absolute factual accuracy on diverse topics.
+ * Generating creative text that is not related to existing quotes and authors.

  ## Bias, Risks, and Limitations
+ This model inherits biases present in its base model, `TinyLlama/TinyLlama-1.1B-Chat-v1.0`, which was trained on a broad corpus. Biases in the `Abirate/english_quotes` dataset (e.g., disproportionate representation of certain authors, historical periods, or cultural perspectives) may also be introduced or amplified.
+
+ ### Risks and Limitations
+
+ * **Limited scope:** Its specialization means it will not perform well on general language tasks.
+ * **Knowledge cut-off:** Its knowledge is largely constrained to the quotes present in the training data; it will likely hallucinate or fail when asked about quotes or authors outside that set.
+ * **Short context:** As a small model, TinyLlama's effective context window may limit its ability to process very long quotes or complex instructions, although the fine-tuning format is designed to mitigate this.
+ * **Hallucinations:** Despite fine-tuning, the model may still "hallucinate" authors for unknown quotes or misattribute known quotes when the input is ambiguous or outside its learned patterns.

  ### Recommendations
+ Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. For critical applications, human review of generated outputs is recommended. The model should primarily be used for its intended task of quote attribution based on the fine-tuning data, and developers should evaluate it on a dataset representative of their specific use case to understand its limitations.

  ## How to Get Started with the Model
+ To run inference, load the base model and then apply the PEFT adapter on top of it. Alternatively, directly load the merged model if you have saved it in a standalone format.
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
+ from peft import PeftModel
+ import torch
+
+ # Define the model paths
+ BASE_MODEL_NAME = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
+ FINE_TUNED_ADAPTER_PATH = "learn-abc/tinyllama-custom-quotes"  # Hugging Face repo ID of this adapter
+ MERGED_MODEL_PATH = "/tinyllama_custom_quotes_fine_tuned/merged_model"  # local path, if you saved the merged model
+
+ # Option 1: load the base model, then the PEFT adapter (requires `peft`)
+ model = AutoModelForCausalLM.from_pretrained(
+     BASE_MODEL_NAME,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+ # Load the fine-tuned adapter and merge it into the base weights for easier inference
+ model = PeftModel.from_pretrained(model, FINE_TUNED_ADAPTER_PATH)
+ model = model.merge_and_unload()
+
+ # Option 2: directly load the merged model if it was saved as a full model
+ # model = AutoModelForCausalLM.from_pretrained(
+ #     MERGED_MODEL_PATH,
+ #     torch_dtype=torch.float16,
+ #     device_map="auto",
+ # )
+
+ tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_NAME)
+ tokenizer.pad_token = tokenizer.eos_token
+ tokenizer.padding_side = "right"
+
+ # Create a text-generation pipeline
+ generator = pipeline(task="text-generation", model=model, tokenizer=tokenizer)
+
+ # Example usage; the prompt follows the same format used during fine-tuning.
+ # NOTE: the closing "</SYS>>" tag (rather than "<</SYS>>") matches the template in this card's training data.
+ test_quote = "The only way to do great work is to love what you do."
+ formatted_prompt = f"""<s>[INST] <<SYS>>
+ You are an AI assistant that is an expert in writing inspiring quotes. Your task is to provide an inspiring quote for the user based on the given concept, followed by the author's name.
+ </SYS>>
+
+ {test_quote} [/INST]"""
+
+ result = generator(formatted_prompt, max_new_tokens=50, num_return_sequences=1)
+ generated_text = result[0]["generated_text"]
+
+ print(f"Prompt: {test_quote}")
+ print(f"Generated Author: {generated_text.split('[/INST]')[-1].strip()}")
+ ```
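+
+ Because the adapter was trained with QLoRA, the base model can also be loaded in 4-bit before attaching the adapter, which keeps memory usage low on small GPUs. The following is a minimal sketch, assuming `bitsandbytes` is installed; the NF4 settings shown are typical QLoRA defaults, not values confirmed by this repository.
+
+ ```python
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig
+ from peft import PeftModel
+ import torch
+
+ # 4-bit NF4 quantization config (assumed typical QLoRA settings)
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+     bnb_4bit_use_double_quant=True,
+ )
+
+ # Load the quantized base model, then attach the LoRA adapter on top of it
+ base_model = AutoModelForCausalLM.from_pretrained(
+     "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
+     quantization_config=bnb_config,
+     device_map="auto",
+ )
+ model = PeftModel.from_pretrained(base_model, "learn-abc/tinyllama-custom-quotes")
+ ```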

  ## Training Details

  ### Training Data
+ The model was fine-tuned on a subset of the `Abirate/english_quotes` dataset, which pairs English quotes with their authors. The data was preprocessed into the Llama 2 chat instruction format, so the model learned to map a given quote (the "instruction") to its author (the "response"). Each training sample was formatted as:
+ ```text
+ <s>[INST] <<SYS>>{system_prompt}</SYS>>\n\n{quote} [/INST] {author}</s>
+ ```
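+
+ The preprocessing step below refers to a custom `format_instruction` function. A minimal sketch of what such a helper might look like, assuming the dataset's `quote` and `author` fields and the system prompt from the inference example (a reconstruction, not the original training code):
+
+ ```python
+ # Hypothetical reconstruction of the formatting step described in this card
+ SYSTEM_PROMPT = (
+     "You are an AI assistant that is an expert in writing inspiring quotes. "
+     "Your task is to provide an inspiring quote for the user based on the given "
+     "concept, followed by the author's name."
+ )
+
+ def format_instruction(sample: dict) -> dict:
+     """Turn one Abirate/english_quotes record into a single training string."""
+     text = (
+         f"<s>[INST] <<SYS>>{SYSTEM_PROMPT}</SYS>>\n\n"
+         f"{sample['quote']} [/INST] {sample['author']}</s>"
+     )
+     return {"text": text}
+ ```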

  ### Training Procedure
+ The model was fine-tuned with QLoRA (Quantized Low-Rank Adaptation), a parameter-efficient fine-tuning technique; a configuration sketch follows the hyperparameter list below.
+
+ #### Preprocessing
+ The `Abirate/english_quotes` dataset was loaded and a custom `format_instruction` function was applied to transform each quote-author pair into the Llama 2 chat template. The dataset was then tokenized with the `TinyLlama/TinyLlama-1.1B-Chat-v1.0` tokenizer, truncating to `max_seq_length=512` with right padding. Labels were created by copying the input IDs. The dataset was split into 90% training and 10% evaluation sets, along the lines sketched below.
+
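+ A minimal sketch of the tokenization and split described above (variable names are illustrative, not taken from the original training script):
+
+ ```python
+ from datasets import load_dataset
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
+ tokenizer.pad_token = tokenizer.eos_token
+ tokenizer.padding_side = "right"
+
+ def tokenize(sample: dict) -> dict:
+     """Tokenize one formatted sample; labels are a copy of the input IDs."""
+     tokens = tokenizer(
+         sample["text"],
+         truncation=True,
+         max_length=512,
+         padding="max_length",
+     )
+     tokens["labels"] = tokens["input_ids"].copy()
+     return tokens
+
+ dataset = load_dataset("Abirate/english_quotes", split="train")
+ dataset = dataset.map(format_instruction)                              # from the sketch above
+ dataset = dataset.map(tokenize, remove_columns=dataset.column_names)
+ dataset = dataset.train_test_split(test_size=0.1)                      # 90% train / 10% eval
+ ```
+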
+ #### Training Hyperparameters
+ * **Training regime:** bf16 mixed precision
+ * **Optimizer:** paged_adamw_32bit
+ * **Learning rate:** 2e-4
+ * **Weight decay:** 0.001
+ * **Gradient norm clipping:** 0.3
+ * **Warmup ratio:** 0.03
+ * **Number of epochs:** 1
+ * **Per-device train batch size:** 2
+ * **Gradient accumulation steps:** 2 (effective batch size of 4)
+ * **LoRA config:** r=64, lora_alpha=16, lora_dropout=0.1, bias="none", task_type="CAUSAL_LM"
+ * **LoRA target modules:** ["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
+ * **Gradient checkpointing:** Enabled with `use_reentrant=False`
+
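+ The snippet below reconstructs a plausible QLoRA setup from the values listed above. It is a sketch under assumptions (NF4 4-bit quantization, a plain `Trainer` loop, a recent `transformers` release); the original run may have used `trl`'s `SFTTrainer` or different quantization details, and the `output_dir` name is illustrative.
+
+ ```python
+ from transformers import AutoModelForCausalLM, BitsAndBytesConfig, Trainer, TrainingArguments
+ from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
+ import torch
+
+ # Load the base model in 4-bit (assumed NF4 settings, as is typical for QLoRA)
+ bnb_config = BitsAndBytesConfig(
+     load_in_4bit=True,
+     bnb_4bit_quant_type="nf4",
+     bnb_4bit_compute_dtype=torch.bfloat16,
+ )
+ base = AutoModelForCausalLM.from_pretrained(
+     "TinyLlama/TinyLlama-1.1B-Chat-v1.0",
+     quantization_config=bnb_config,
+     device_map="auto",
+ )
+ base = prepare_model_for_kbit_training(base)
+
+ # LoRA settings exactly as listed above
+ lora_config = LoraConfig(
+     r=64,
+     lora_alpha=16,
+     lora_dropout=0.1,
+     bias="none",
+     task_type="CAUSAL_LM",
+     target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
+ )
+ model = get_peft_model(base, lora_config)
+
+ # Training arguments mirroring the hyperparameters listed above
+ args = TrainingArguments(
+     output_dir="tinyllama_custom_quotes_fine_tuned",   # illustrative
+     num_train_epochs=1,
+     per_device_train_batch_size=2,
+     gradient_accumulation_steps=2,
+     learning_rate=2e-4,
+     weight_decay=0.001,
+     max_grad_norm=0.3,
+     warmup_ratio=0.03,
+     bf16=True,
+     optim="paged_adamw_32bit",
+     gradient_checkpointing=True,
+     gradient_checkpointing_kwargs={"use_reentrant": False},
+ )
+
+ # `dataset` is the tokenized 90/10 split from the preprocessing sketch above
+ trainer = Trainer(model=model, args=args, train_dataset=dataset["train"], eval_dataset=dataset["test"])
+ trainer.train()
+ ```
+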
+ #### Speeds, Sizes, Times
+ * **Model parameters:** 1.1 billion (base model)
+ * **Trainable parameters (LoRA):** Only the LoRA adapter weights are trained, a small fraction of the base model's parameters; the exact count for this configuration can be printed with PEFT, as shown below.
+ * **Training time:** Approximately 1 hour 35 minutes for 400 steps on the hardware described below.
+ * **Checkpoint size:** Only the PEFT adapter weights (small relative to the full 1.1B-parameter model) are saved during training, along with tokenizer files. The full merged model is saved once at the end.
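+
+ A quick way to see the exact trainable-parameter count for this LoRA configuration (assumes the PEFT model object from the training sketch above):
+
+ ```python
+ # Prints: "trainable params: ... || all params: ... || trainable%: ..."
+ model.print_trainable_parameters()
+ ```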

  ## Evaluation
+ ### Testing Data
+ The model was evaluated on a 10% split of the `Abirate/english_quotes` dataset that was held out from training. This validation set consists of tokenized quote-author pairs.
+
+ ### Factors
+ Evaluation was performed across the entire validation dataset; no specific subpopulations or sub-domains were isolated for disaggregated analysis.

+ ### Metrics
+ The primary metric tracked during training was:
+
+ * **eval_loss (validation loss):** a measure of how well the model predicts the next token on the unseen validation data; lower values indicate better performance.

  ### Results
+ At the end of the single training epoch, the eval_loss reached approximately 0.3576, indicating that the model learned to predict the author for a given quote in the specified format. A quick conversion to perplexity is shown below.
+
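+ For intuition, the cross-entropy loss can be converted to a token-level perplexity; this is a generic calculation on the number above, not an additional reported metric:
+
+ ```python
+ import math
+
+ eval_loss = 0.3576               # final validation loss reported above
+ perplexity = math.exp(eval_loss) # ≈ 1.43
+ print(f"Perplexity: {perplexity:.2f}")
+ ```
+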
+ #### Summary
+ The fine-tuning process successfully adapted TinyLlama to the task of quote attribution, as evidenced by the low validation loss. The model can generate the correct author for quotes it was fine-tuned on, following the Llama 2 chat instruction template.

  ## Environmental Impact
+ Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
+
+ * **Hardware Type:** NVIDIA GPU (e.g., T4 or similar with ~14.57 GiB VRAM)
+ * **Hours used:** ~1.6 hours
+ * **Cloud Provider:** user's cloud provider (e.g., AWS EC2)
+ * **Carbon Emitted:** roughly 50-100 grams of CO2eq, estimated from typical cloud-GPU power consumption and average grid emission factors for a short training run; this is very low given the small model size and short training duration. A back-of-the-envelope version of this estimate is sketched below.
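+
+ The sketch uses assumed values (a ~70 W T4-class board and a ~400 gCO2eq/kWh grid average) rather than measured data:
+
+ ```python
+ gpu_power_kw = 0.070           # assumed T4-class GPU board power, in kW
+ hours = 1.6                    # training time reported above
+ grid_gco2_per_kwh = 400        # assumed average grid emission factor
+
+ energy_kwh = gpu_power_kw * hours             # ≈ 0.112 kWh
+ emissions_g = energy_kwh * grid_gco2_per_kwh  # ≈ 45 g CO2eq, the same order as the estimate above
+ print(f"{emissions_g:.0f} g CO2eq")
+ ```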

+ ## Technical Specifications

  ### Model Architecture and Objective
+ The model is based on the `TinyLlama-1.1B-Chat-v1.0` architecture, a decoder-only transformer similar to Llama 2. The fine-tuning objective was causal language modeling: predicting the author token sequence that follows a given quote within a chat-based instruction prompt. QLoRA adapts this architecture efficiently by injecting low-rank adapters rather than retraining all of the original parameters.

  ### Compute Infrastructure
+ #### Hardware
+ The fine-tuning was performed on a system equipped with an NVIDIA GPU with approximately 14.57 GiB of VRAM.
+
+ #### Software
+ * **Operating System:** Linux (e.g., Ubuntu)
+ * **Python Version:** Python 3.12+
+ * **Deep Learning Framework:** PyTorch
+ * **Libraries:** Hugging Face `transformers`, `datasets`, `peft`, `bitsandbytes`, `trl`
+
+ ## Citation
+ ### BibTeX
+ ```bibtex
+ @misc{tinyllama_custom_quotes_fine_tuned,
+   author       = {learn-abc},
+   title        = {TinyLlama Custom Quotes Fine-Tune},
+   year         = {2025},
+   publisher    = {Hugging Face},
+   journal      = {Hugging Face Hub},
+   howpublished = {\url{https://huggingface.co/learn-abc/tinyllama-custom-quotes}}
+ }
+ ```
+
+ ### APA
+
+ learn-abc. (2025). *TinyLlama Custom Quotes Fine-Tune*. Hugging Face. https://huggingface.co/learn-abc/tinyllama-custom-quotes
+
+ ## Glossary
+ * **LoRA (Low-Rank Adaptation):** A parameter-efficient fine-tuning technique that adds small, trainable matrices (adapters) to a pre-trained model, significantly reducing the number of parameters that need to be updated during fine-tuning.
+ * **QLoRA (Quantized LoRA):** An extension of LoRA that further reduces memory usage by quantizing the pre-trained model's weights to 4-bit precision during training.
+ * **Causal Language Model:** A type of language model that predicts the next token in a sequence based only on the preceding tokens.
+ * **PEFT (Parameter-Efficient Fine-Tuning):** A family of methods designed to fine-tune large models more efficiently by updating only a small subset of the model's parameters.
+ * **Hallucination:** When an LLM generates plausible but factually incorrect or fabricated information.

  ## Model Card Contact

+ ### Contact Me
+ For any inquiries or support, please reach out to:
+
+ * **Author:** [Abhishek Singh](https://github.com/SinghIsWriting/)
+ * **LinkedIn:** [My LinkedIn Profile](https://www.linkedin.com/in/abhishek-singh-bba2662a9)
+ * **Portfolio:** [Abhishek Singh Portfolio](https://portfolio-abhishek-singh-nine.vercel.app/)
+
+ ### Framework versions
+ - PEFT 0.17.0