---
license: apache-2.0
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
tags:
- lora
- fused
- text-to-sql
- natural-language-to-sql
- mlx
- apple-silicon
- fine-tuning
- instruction-following
model_creator: Jerome Mohanan
datasets:
- spider
---

					
						
# TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL

**TinyLlama-1.1B-Chat-LoRA-Fused-v1.0** is a 1.1-billion-parameter model derived from *TinyLlama/TinyLlama-1.1B-Chat-v1.0*.
Using parameter-efficient **LoRA** fine-tuning and the Apple-Silicon-native **MLX** framework, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were **merged (“fused”)** into the base weights, so you only need this single checkpoint for inference.

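Conceptually, “fusing” means folding the low-rank update into the base weight so that `W_fused = W + (α / r) · B · A`. The NumPy sketch below illustrates that arithmetic with toy shapes and hypothetical variable names; it is not the MLX code used to produce this checkpoint.

```python
import numpy as np

# Hypothetical names and toy shapes, for illustration only.
def fuse_lora(W, A, B, rank=16, alpha=32):
    """Return the fused weight W + (alpha / rank) * B @ A."""
    scale = alpha / rank              # = 2.0 for this model's rank/alpha
    return W + scale * (B @ A)        # same shape as the base weight W

out_dim, in_dim, r = 64, 64, 16       # real TinyLlama projections are e.g. 2048 x 2048
W = np.random.randn(out_dim, in_dim).astype(np.float32)   # frozen base weight
A = np.random.randn(r, in_dim).astype(np.float32)         # LoRA down-projection
B = np.random.randn(out_dim, r).astype(np.float32)        # LoRA up-projection

W_fused = fuse_lora(W, A, B)
assert W_fused.shape == W.shape       # drop-in replacement for W
```

Once fused, inference needs only this single checkpoint; no adapter files or PEFT wrappers are loaded at run time.
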
					
						
---

## 🗝️ Key Facts

| Property | Value |
|---|---|
| Base model | TinyLlama 1.1B Chat v1.0 |
| Task | Natural-Language → SQL generation |
| Fine-tuning method | Low-Rank Adaptation (LoRA) @ rank = 16 |
| Training framework | MLX 0.8 + PEFT |
| Hardware | MacBook Pro M4 Pro (20-core GPU) |
| Checkpoint size | 2.1 GB (fp16, fused) |
| License | Apache 2.0 |

---

					
						
## ✨ Intended Use

* **Interactive data exploration** inside BI notebooks or chatbots.
* **Customer-support analytics** — empower non-SQL users to ask free-form questions.
* **Education & demos** showing how LoRA + MLX enables rapid on-device fine-tuning.

The model was trained on synthetic NL-SQL pairs for demo purposes. **Do not** deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and a security review.

---

					
						
## 💻 Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Always include the relevant schema in the prompt; the model does not do schema linking.
prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))  # prints prompt + generated SQL
```
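Because the checkpoint is also tagged for MLX, you may prefer to run it natively on Apple Silicon with the `mlx-lm` package. The sketch below assumes the weights can be loaded (or converted) by `mlx-lm` and that your installed version exposes the `load`/`generate` helpers with these keyword arguments; treat it as a starting point rather than a verified recipe.

```python
# pip install mlx-lm   (Apple Silicon only)
from mlx_lm import load, generate

# load() fetches the weights and returns (model, tokenizer)
model, tokenizer = load("jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0")

prompt = (
    "### Database schema\n"
    "table orders(id, customer_id, total, created_at)\n"
    "table customers(id, name, country)\n\n"
    "### Question\n"
    "List total sales per country ordered by total descending."
)

# Keyword names may differ slightly between mlx-lm releases.
sql = generate(model, tokenizer, prompt=prompt, max_tokens=128)
print(sql)
```
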
					
						

---

					
						
## 🏋️‍♂️ Training Details

* **Data** – 10 K synthetic NL/SQL pairs auto-generated from an open-domain schema list, then manually spot-checked for correctness.
* **Pre-processing** – schema and question paired using the *Text-to-SQL prompt* pattern; SQL statements lower-cased; no anonymisation.
* **Hyper-parameters** (a configuration sketch follows below)
  * batch size = 32 (gradient accumulation = 4)
  * learning rate = 2e-4 (cosine schedule)
  * epochs = 3
  * LoRA rank = 16, α = 32
  * fp16 mixed precision

Total training time ≈ 5 minutes on Apple Silicon.
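The exact training script is not published with this card, but the LoRA settings above map directly onto a Hugging Face PEFT configuration. The sketch below is illustrative only: the target modules and dropout value are not stated in the card and are chosen here as typical defaults for Llama-style models.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")

lora_cfg = LoraConfig(
    r=16,                                 # LoRA rank (from the card)
    lora_alpha=32,                        # alpha (from the card)
    target_modules=["q_proj", "v_proj"],  # assumed; not stated in the card
    lora_dropout=0.05,                    # assumed; not stated in the card
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()        # only the adapter weights are trainable
```

With r = 16 and α = 32 the adapter update is scaled by α/r = 2, matching the fusing arithmetic shown earlier.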
					
						

---

## 🌱 Environmental Impact

LoRA fine-tuning on consumer Apple Silicon is energy-efficient: only the low-rank adapter weights are updated, and the run described above finished in roughly five minutes on a single MacBook Pro.

---

					
						
## 🛠️ Limitations & Biases

* Trained on a synthetic, limited dataset → may under-perform on real production schemas.
* Does **not** perform schema linking; you must include the relevant schema in the prompt.
* Generated SQL is not guaranteed to be correct or safe; always validate queries before execution (a minimal guard is sketched below).
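As one example of the validation meant above, the hypothetical helper below rejects anything that is not a single read-only `SELECT` statement before it reaches the database. It is a minimal sketch, not a substitute for parameterised queries, real SQL parsing, or database-level permissions.

```python
import re

# Keywords that should never appear in output from a read-only NL-to-SQL assistant.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|truncate|grant|create)\b", re.IGNORECASE
)

def is_safe_select(sql: str) -> bool:
    """Conservative guard: one statement, starting with SELECT, no write keywords."""
    statements = [s for s in sql.strip().split(";") if s.strip()]
    if len(statements) != 1:
        return False
    stmt = statements[0].strip()
    return stmt.lower().startswith("select") and not FORBIDDEN.search(stmt)

print(is_safe_select("SELECT name FROM customers WHERE country = 'DE'"))  # True
print(is_safe_select("DROP TABLE customers"))                             # False
```
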
					
						

---

					
						
## ✍️ Citation
```
@misc{mohanan2024tinyllama_sql_lora,
  title  = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author = {Jerome Mohanan},
  note   = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year   = {2024}
}
```
					
						

---

## 📫 Contact
Questions or feedback? Ping **@jero2rome** on Hugging Face or email <[email protected]>.