TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 — Natural-Language-to-SQL

TinyLlama-1.1B-Chat-LoRA-Fused-v1.0 is a 1.1-billion-parameter model derived from TinyLlama/TinyLlama-1.1B-Chat-v1.0.
Using parameter-efficient LoRA fine-tuning and Apple's MLX framework, which runs natively on Apple Silicon, the model has been specialised to convert plain-English questions into syntactically correct SQL queries for relational databases.
After training, the LoRA adapters were merged (“fused”) into the base weights, so you only need this single checkpoint for inference.


🗝️ Key Facts

  • Base model: TinyLlama 1.1B Chat v1.0
  • Task: Natural-Language → SQL generation
  • Fine-tuning method: Low-Rank Adaptation (LoRA), rank = 16
  • Training framework: MLX 0.8 + PEFT
  • Hardware: MacBook Pro M4 Pro (20-core GPU)
  • Checkpoint size: 2.1 GB (fp16, fused)
  • License: Apache 2.0

✨ Intended Use

  • Interactive data exploration inside BI notebooks or chatbots.
  • Customer-support analytics — empower non-SQL users to ask free-form questions.
  • Education & demos showing how LoRA + MLX enables rapid on-device fine-tuning.

The model was trained on synthetic NL-SQL pairs for demo purposes. Do not deploy it in production for mission-critical SQL generation without additional evaluation on your own schema and security review.
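
As a starting point for such an evaluation, a simple execution-accuracy spot check against your own database could look like the sketch below. The eval_pairs list and the generate_sql helper are illustrative placeholders you would supply yourself (see Quick Start below for loading the model); nothing here ships with the repository.

# Sketch of an execution-accuracy spot check on your own SQLite database.
# `eval_pairs` and `generate_sql` are illustrative placeholders, not part of this repository.
import sqlite3

eval_pairs = [
    ("How many customers are in Germany?",
     "SELECT COUNT(*) FROM customers WHERE country = 'Germany'"),
]

def execution_accuracy(db_path: str, generate_sql) -> float:
    """Fraction of questions whose generated SQL returns the same rows as the reference SQL."""
    conn = sqlite3.connect(db_path)
    hits = 0
    for question, reference_sql in eval_pairs:
        predicted_sql = generate_sql(question)          # your wrapper around model.generate
        try:
            predicted = conn.execute(predicted_sql).fetchall()
            reference = conn.execute(reference_sql).fetchall()
            # order-insensitive row comparison
            hits += int(sorted(map(repr, predicted)) == sorted(map(repr, reference)))
        except sqlite3.Error:
            pass                                        # malformed SQL counts as a miss
    conn.close()
    return hits / len(eval_pairs)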


💻 Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = """\
### Database schema
table orders(id, customer_id, total, created_at)
table customers(id, name, country)

### Question
List total sales per country ordered by total descending."""

inputs = tok(prompt, return_tensors="pt")
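# Generate up to 128 new tokens; the decoded output echoes the prompt followed by the generated SQL.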
sql_out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(sql_out[0], skip_special_tokens=True))
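
Because the adapters are already fused, the checkpoint can also be run natively on Apple Silicon through the mlx-lm package. The snippet below is a sketch, assuming a recent mlx-lm release that can download and convert this Hugging Face repository on the fly; it reuses model_id and prompt from the example above.

# Optional: native Apple-Silicon inference via mlx-lm (pip install mlx-lm).
from mlx_lm import load, generate

mlx_model, mlx_tokenizer = load(model_id)   # downloads and converts the HF checkpoint
print(generate(mlx_model, mlx_tokenizer, prompt=prompt, max_tokens=128))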

🏋️‍♂️ Training Details

  • Data – 10K synthetic NL/SQL pairs auto-generated from an open-domain schema list, then manually spot-checked for correctness.
  • Pre-processing – each schema and question were paired using the text-to-SQL prompt pattern shown in Quick Start; SQL statements were lower-cased; no anonymisation was applied.
  • Hyper-parameters
    • batch size = 32 (gradient accumulation = 4)
    • learning rate = 2e-4 (cosine schedule)
    • epochs = 3
    • LoRA rank = 16, α = 32
    • fp16 mixed-precision

Total training time ≈ 5 minutes on Apple Silicon (a single MacBook Pro M4 Pro).
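
For readers who want to reproduce a similar run with Hugging Face PEFT, the hyper-parameters above map roughly onto the configuration below. This is a sketch, not the exact training script used here; target_modules and lora_dropout are assumptions that are not recorded above.

# Sketch of a PEFT LoRA configuration matching the hyper-parameters above.
# target_modules and lora_dropout are illustrative assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("TinyLlama/TinyLlama-1.1B-Chat-v1.0")
lora_cfg = LoraConfig(
    r=16,                                  # LoRA rank
    lora_alpha=32,                         # scaling factor alpha
    lora_dropout=0.05,                     # assumption: not documented above
    target_modules=["q_proj", "v_proj"],   # assumption: typical attention projections
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base, lora_cfg)
peft_model.print_trainable_parameters()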


🌱 Environmental Impact

LoRA fine-tuning on consumer Apple Silicon is energy-efficient: this run completed in roughly five minutes on a single MacBook Pro M4 Pro, so its energy and carbon footprint is negligible compared with a full-parameter fine-tune on data-centre GPUs.


🛠️ Limitations & Biases

  • Trained on a synthetic, limited dataset → may under-perform on real production schemas.
  • Does not perform schema-linking; you must include the relevant schema in the prompt.
  • SQL is not guaranteed to be safe; always validate queries before execution (a minimal guard is sketched below).
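
A minimal pre-execution guard might look like the sketch below: it accepts only a single read-only SELECT statement and rejects everything else. It is a sketch only, not a substitute for a real SQL parser or database-level permissions.

# Sketch of a read-only guard for model-generated SQL.
# A production system should rely on a proper SQL parser and database permissions instead.
import re

FORBIDDEN = re.compile(
    r"\b(insert|update|delete|drop|alter|create|grant|truncate|attach|pragma)\b",
    re.IGNORECASE,
)

def is_safe_select(sql: str) -> bool:
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:               # reject multi-statement payloads
        return False
    if FORBIDDEN.search(statement):    # reject data-definition / data-modification keywords
        return False
    return statement.lower().startswith("select")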

✍️ Citation

@misc{mohanan2024tinyllama_sql_lora,
  title   = {TinyLlama-1.1B-Chat-LoRA-Fused-v1.0},
  author  = {Jerome Mohanan},
  note    = {Hugging Face repository: https://huggingface.co/jero2rome/tinyllama-1.1b-chat-lora-fused-v1.0},
  year    = {2024}
}

📫 Contact

Questions or feedback? Ping @jero2rome on Hugging Face or email [email protected].
