---
license: apache-2.0
tags:
  - qlora
  - tinyllama
  - cli
  - command-line
  - fine-tuning
  - low-resource
  - internship
  - fenrir
model_type: TinyLlamaForCausalLM
base_model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
datasets:
  - custom-cli-qa
library_name: peft
pipeline_tag: text-generation
---
# 🔧 CLI LoRA TinyLLaMA Fine-Tuning (Fenrir Internship Project)
🚀 This repository presents a **LoRA fine-tuned version of TinyLLaMA-1.1B-Chat** trained on a custom dataset of CLI Q&A. Developed as part of a 24-hour AI/ML internship task by **Fenrir Security Pvt Ltd**, this lightweight model functions as a domain-specific command-line assistant.

---
## 📁 Dataset
A curated collection of 200+ real-world CLI Q&A pairs covering:
- Git (branching, stash, merge, rebase)
- Bash (variables, loops, file manipulation)
- `grep`, `tar`, `gzip` (command syntax, flags)
- Python environments (`venv`, pip)

Stored in `cli_questions.json`.
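
The card doesn't show the schema of `cli_questions.json`. As a purely hypothetical illustration (the field names below are assumptions, not the repo's actual schema), an instruction-style layout could be loaded and sanity-checked like this:

```python
import json

# Hypothetical entry; the real schema is whatever cli_questions.json defines.
sample = '[{"question": "How do I stash uncommitted changes?", "answer": "git stash"}]'

pairs = json.loads(sample)
for pair in pairs:
    # Each entry should carry both fields before it is used for fine-tuning.
    assert {"question", "answer"} <= pair.keys()
print(f"Loaded {len(pairs)} Q&A pair(s)")
```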

---
## ⚙️ Model Details
| Field              | Value                                      |
|-------------------|--------------------------------------------|
| Base Model         | `TinyLlama/TinyLlama-1.1B-Chat-v1.0`       |
| Fine-Tuning Method | QLoRA via `peft`                           |
| Epochs             | 3 (with early stopping)                    |
| Adapter Size       | ~7MB (LoRA weights only)                   |
| Hardware           | Local CPU (low-resource)                   |
| Tokenizer          | Inherited from base model                  |
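
The ~7 MB adapter size isn't derived in the card; the rank and target modules live in `adapter_config.json`. As a rough, hypothetical sanity check (the rank, projection shapes, and layer count below are assumptions about TinyLlama-1.1B, not the actual training config), LoRA adds `rank * (d_in + d_out)` parameters per adapted weight matrix:

```python
# Hypothetical sizing sketch; real hyperparameters are in adapter_config.json.
def lora_param_count(rank, shapes, n_layers):
    """Total LoRA parameters: rank * (d_in + d_out) per adapted matrix."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in shapes)
    return per_layer * n_layers

# Assumed: rank 16, q_proj (2048 -> 2048) and v_proj (2048 -> 256), 22 layers.
params = lora_param_count(16, [(2048, 2048), (2048, 256)], 22)
size_mb = params * 2 / 1e6  # fp16 adapters store 2 bytes per parameter
print(f"{params} params, ~{size_mb:.1f} MB")  # a higher rank or more targets grows this
```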
---
## 📊 Evaluation
| Metric                     | Result         |
|----------------------------|----------------|
| Accuracy on Eval Set       | ~92%           |
| Manual Review              | High relevance |
| Hallucination Rate         | Very low       |
| Inference Time (CPU)       | < 1s / query   |
All results are stored in `eval_results.json`.

---
## 🧠 Files Included
- `adapter_model.safetensors` — fine-tuned LoRA weights
- `adapter_config.json` — LoRA hyperparameters
- `training.ipynb` — complete training notebook
- `agent.py` — CLI interface to test the model
- `cli_questions.json` — training dataset
- `eval_results.json` — eval results
- `requirements.txt` — dependencies

---
## 📦 Inference Example
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the base model and tokenizer, then attach the LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
tokenizer = AutoTokenizer.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
peft_model = PeftModel.from_pretrained(base_model, "Harish2002/cli-lora-tinyllama")
peft_model.eval()

prompt = "How do I initialize a new Git repository?"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = peft_model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```