+---
+license: apache-2.0
+language:
+  - en
+base_model:
+  - TinyLlama/TinyLlama-1.1B-Chat-v1.0
+tags:
+  - lora
+  - fused
+  - text-to-sql
+  - natural-language-to-sql
+  - mlx
+  - apple-silicon
+  - fine-tuning
+  - instruction-following
+model_creator: Jerome Mohanan
+datasets:
+  - spider # used conceptually as inspiration; see Training Details
+---
+# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL
+**TinyLlama-1.1B-Chat-LoRA-Fused-v1.0** is a 1.1 billion parameter model derived from *TinyLlama/TinyLlama-1.1B-Chat-v1.0*.
+Using parameter-efficient **LoRA** fine-tuning and the new Apple-Silicon-native **MLX** framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
+After training, the LoRA adapters were **merged (“fused”)** into the base weights, so you only need this single checkpoint for inference.
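As a rough illustration of what fusing means, the sketch below applies the standard LoRA update `W' = W + scale * (B @ A)` to a toy 2x2 weight matrix and checks that the fused matrix gives the same output as the separate base-plus-adapter path. The helper names `matmul` and `fuse_lora` and all numeric values are made up for this example. The scale of 2.0 assumes the usual `α / rank` convention with the α = 32 and rank = 16 listed under Training Details. The actual MLX fuse step used for this checkpoint may differ in detail.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def fuse_lora(w, lora_a, lora_b, scale):
    """Return W + scale * (B @ A), i.e. the adapter folded into the base weight."""
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Toy layer: 2x2 base weight with a rank-1 adapter (illustrative values only).
w = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]       # shape (rank, in_features)
lora_b = [[0.5], [0.25]]    # shape (out_features, rank)
scale = 32 / 16             # assumed alpha / rank

fused = fuse_lora(w, lora_a, lora_b, scale)

# The fused weight must reproduce the unfused computation W x + scale * B (A x).
x = [[3.0], [4.0]]
base_out = matmul(w, x)
adapter_out = matmul(lora_b, matmul(lora_a, x))
unfused_out = [[b[0] + scale * a[0]] for b, a in zip(base_out, adapter_out)]
fused_out = matmul(fused, x)
print(fused_out == unfused_out)
```

Because the adapter is folded into the weights, inference needs no extra matrix multiplications and no separate adapter file.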
+---
+## 🗝️ Key Facts
+| Property | Value |
+|---|---|
+| Base model | TinyLlama 1.1B Chat v1.0 |
+| Task | Natural-Language → SQL generation |
+| Fine-tuning method | Low-Rank Adaptation (LoRA) @ rank = 16 |
+| Training framework | MLX 0.8 + PEFT |
+| Hardware | MacBook Pro M4 Pro (20-core GPU) |
+| Checkpoint size | 2.1 GB (fp16, fused) |
+| License | Apache 2.0 |
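The checkpoint size in the table can be sanity-checked with simple arithmetic. This sketch assumes a nominal 1.1 billion parameters stored at 2 bytes each (fp16) and ignores file metadata. The function name `fp16_checkpoint_size` is hypothetical.

```python
def fp16_checkpoint_size(num_params: float) -> tuple[float, float]:
    """Return the approximate size as (decimal GB, binary GiB) at 2 bytes per parameter."""
    size_bytes = num_params * 2  # fp16 stores each parameter in 2 bytes
    return size_bytes / 1e9, size_bytes / 2**30


size_gb, size_gib = fp16_checkpoint_size(1.1e9)
print(f"{size_gb:.2f} GB, {size_gib:.2f} GiB")
```

The estimate comes to about 2.2 GB in decimal units, or roughly 2.05 GiB in binary units, which is in line with the reported 2.1 GB. Fusing adds no parameters, so the size matches that of the base model.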
+---
+## ✨ Intended Use
+* **Interactive data exploration** inside BI notebooks or chatbots.
+* **Customer-support analytics** — empower non-SQL users to ask free-form questions.
+* **Education & demos** showing how LoRA + MLX enables rapid on-device fine-tuning.
+
+The model was trained on synthetic NL-SQL pairs for demo purposes. **Do not** deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and a security review.
+---
+## 💻 Quick Start
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
+tok = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForCausalLM.from_pretrained(model_id)
+
+# The relevant schema must be part of the prompt (see Limitations & Biases).
+prompt = """\
+### Database schema
+table orders(id, customer_id, total, created_at)
+table customers(id, name, country)
+### Question
+List total sales per country ordered by total descending."""
+
+inputs = tok(prompt, return_tensors="pt")
+sql_out = model.generate(**inputs, max_new_tokens=128)
+
+# Decode only the newly generated tokens so the prompt is not echoed back.
+new_tokens = sql_out[0][inputs["input_ids"].shape[1]:]
+print(tok.decode(new_tokens, skip_special_tokens=True))
+```
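For schemas held in code rather than typed by hand, the prompt layout used in the Quick Start can be assembled programmatically. The helper below is a minimal sketch. The name `build_prompt` and the dict-based schema representation are assumptions made for illustration and are not part of this repository.

```python
def build_prompt(schema: dict[str, list[str]], question: str) -> str:
    """Format tables and a question using the '### Database schema / ### Question' layout."""
    table_lines = [f"table {name}({', '.join(cols)})" for name, cols in schema.items()]
    return "\n".join(["### Database schema", *table_lines, "### Question", question])


schema = {
    "orders": ["id", "customer_id", "total", "created_at"],
    "customers": ["id", "name", "country"],
}
prompt = build_prompt(schema, "List total sales per country ordered by total descending.")
print(prompt)
```

The resulting string is identical to the hand-written prompt in the Quick Start and can be passed to the tokenizer in the same way.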
+---
+## 🏋️‍♂️ Training Details
+* **Data** – 10,000 synthetic NL/SQL pairs auto-generated from an open-domain schema list, then manually spot-checked for correctness.
+* **Pre-processing** – schema + question paired using the *Text-to-SQL prompt* pattern; SQL statements lower-cased; no anonymisation.
+* **Hyper-parameters**
+  * batch size = 32 (gradient accumulation steps = 4)
+  * learning rate = 2e-4 (cosine schedule)
+  * epochs = 3
+  * LoRA rank = 16, α = 32
+  * fp16 mixed-precision
+
+Total GPU time ≈ 5 minutes on Apple Silicon.
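To make the cosine schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes no warm-up and decay to zero, and it treats 32 as the effective batch size over 10,000 examples. None of these details are stated in this card, and the function name `cosine_lr` is hypothetical.

```python
import math


def cosine_lr(step: int, total_steps: int, peak_lr: float = 2e-4, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr at step 0 to min_lr at total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Assumed step count: ceil(10,000 examples / batch of 32) per epoch, for 3 epochs.
steps_per_epoch = math.ceil(10_000 / 32)
total_steps = 3 * steps_per_epoch

for step in (0, total_steps // 2, total_steps):
    print(step, f"{cosine_lr(step, total_steps):.2e}")
```

Under these assumptions the run has 939 steps. The rate starts at 2e-4, is close to 1e-4 around the midpoint, and reaches zero at the final step.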
+---
+## 🌱 Environmental Impact
+LoRA fine-tuning on consumer Apple Silicon is energy-efficient: the full training run took about 5 minutes on a single MacBook Pro M4 Pro.
+---
+## 🛠️ Limitations & Biases
+* Trained on a synthetic, limited dataset → may under-perform on real production schemas.
+* Does **not** perform schema-linking; you must include the relevant schema in the prompt.
+* SQL is not guaranteed to be safe; always validate queries before execution.
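One possible way to validate generated SQL before execution is sketched below. It accepts only a single `SELECT` statement and compiles it against an empty in-memory copy of the schema using Python's built-in `sqlite3` module. The function name `validate_select` is hypothetical, and SQLite stands in for the real target database, whose dialect may differ. Treat this as a first filter and not as a complete security review.

```python
import sqlite3


def validate_select(sql: str, schema_ddl: str) -> tuple[bool, str]:
    """Return (ok, reason) without running the query against real data."""
    stripped = sql.strip().rstrip(";").strip()
    # Conservative: any remaining ';' is rejected, even inside a string literal.
    if ";" in stripped:
        return False, "multiple statements are not allowed"
    if not stripped.lower().startswith("select"):
        return False, "only SELECT queries are allowed"
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)  # empty tables with the real column names
        # EXPLAIN QUERY PLAN compiles the statement, so unknown tables/columns fail here.
        conn.execute("EXPLAIN QUERY PLAN " + stripped)
    except sqlite3.Error as exc:
        return False, f"query does not compile: {exc}"
    finally:
        conn.close()
    return True, "ok"


ddl = """
CREATE TABLE orders(id, customer_id, total, created_at);
CREATE TABLE customers(id, name, country);
"""
good = (
    "select c.country, sum(o.total) as total from orders o "
    "join customers c on c.id = o.customer_id group by c.country order by total desc"
)
print(validate_select(good, ddl))
print(validate_select("drop table orders", ddl))
```

In a real deployment, running the approved query through a read-only database role adds a second layer of protection.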
+---
+## ✍️ Citation
+```
+@misc{mohanan2024tinyllama_sql_lora,
+  title   = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
+  author  = {Jerome Mohanan},
+  note    = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
+  year    = {2024}
+}
+```
+---
+## 📫 Contact
+Questions or feedback? Ping **@jero2rome** on Hugging Face.