Turkish-LLM-14B-Instruct

A High-Performance Turkish Language Model Fine-Tuned with SFT + DPO


Developer: Ogulcan Aydogan | Release: March 2026 | License: Apache 2.0


Table of Contents

  • Overview
  • Motivation
  • Model Details
  • Training Pipeline
  • Benchmark Results
  • Model Family
  • Usage
  • Limitations and Bias
  • Citation

Overview

Turkish-LLM-14B-Instruct is a 14.7-billion-parameter Turkish language model built on top of Qwen/Qwen2.5-14B-Instruct. It was fine-tuned in two stages -- Supervised Fine-Tuning (SFT) on curated Turkish instruction data, followed by Direct Preference Optimization (DPO) for alignment -- to deliver state-of-the-art performance on Turkish natural language understanding and generation tasks.

The model demonstrates a +0.47 point improvement on MMLU_TR over the base Qwen2.5-14B-Instruct, achieved through a two-stage SFT + DPO pipeline trained on 242K+ curated Turkish instruction examples. This is part of an ongoing effort to build a comprehensive Turkish LLM family spanning 1.5B to 72B parameters.


Motivation

Turkish is spoken by over 80 million native speakers, making it one of the most widely spoken languages in the world. Despite this, Turkish remains significantly underrepresented in the large language model ecosystem. The vast majority of frontier LLMs are trained predominantly on English data, and their Turkish capabilities are incidental rather than intentional.

This project addresses that gap directly:

  • Linguistic coverage: Turkish is an agglutinative language with rich morphology, vowel harmony, and SOV word order -- properties that are poorly captured by models trained primarily on English.
  • Cultural context: Effective Turkish language models require not just linguistic fluency but also an understanding of Turkish history, geography, science education curricula, and cultural norms.
  • Accessibility: By releasing this model under the Apache 2.0 license and providing GGUF quantizations for local deployment, we aim to make high-quality Turkish NLP accessible to researchers, developers, and organizations across Turkey and the broader Turkic-language community.
  • Benchmark-driven development: Each model version is rigorously evaluated against established Turkish benchmarks to ensure that fine-tuning yields genuine improvements rather than superficial fluency.

Model Details

| Property | Value |
|---|---|
| Developer | Ogulcan Aydogan |
| Model Name | Turkish-LLM-14B-Instruct |
| Base Model | Qwen/Qwen2.5-14B-Instruct |
| Parameters | 14.7B |
| Architecture | Transformer (decoder-only, causal language model) |
| Context Length | 4,096 tokens |
| Precision | bfloat16 |
| Fine-Tuning Method | SFT + DPO (Direct Preference Optimization) |
| Language | Turkish (tr) |
| License | Apache 2.0 |
| Release Date | March 2026 |

Training Pipeline

The model was trained in a two-stage pipeline, with each stage using parameter-efficient LoRA adapters to maximize quality while remaining computationally feasible on a single GPU.

Stage 1: Supervised Fine-Tuning (SFT)

The base model was fine-tuned on a curated Turkish instruction-following dataset comprising approximately 242,000 examples spanning diverse domains.

| Hyperparameter | Value |
|---|---|
| Method | LoRA (Low-Rank Adaptation) |
| LoRA rank (r) | 32 |
| LoRA alpha | 64 |
| Dataset size | ~242K instruction-response pairs |
| Domains | STEM, Mathematics, Science, History, Geography, General Knowledge |
| Framework | HuggingFace TRL + PEFT |
| Hardware | NVIDIA A100 80GB PCIe |
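
The stage-1 setup can be sketched with TRL + PEFT. Only the rank, alpha, precision, and framework come from the table above; the target modules, output path, and how the dataset is loaded are illustrative assumptions, not published details of this model's training run.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter configuration -- r and alpha from the table above;
# target_modules is an assumption (the card does not list them)
peft_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Training arguments -- output path and logging are illustrative
sft_args = SFTConfig(
    output_dir="turkish-llm-14b-sft",
    bf16=True,  # matches the card's bfloat16 precision
)

# These objects would then be passed to trl.SFTTrainer together with the
# base model "Qwen/Qwen2.5-14B-Instruct" and the Turkish instruction dataset.
```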

Stage 2: DPO Alignment (Direct Preference Optimization)

Following SFT, the model was further aligned using DPO on Turkish preference data to improve response quality and reduce undesirable outputs.

| Hyperparameter | Value |
|---|---|
| Method | DPO with LoRA |
| LoRA rank (r) | 32 |
| Beta | 0.1 |
| Learning rate | 5e-7 |
| Dataset | selimc/orpo-dpo-mix-TR-20k |
| Dataset size | 19.9K preference pairs |
| Framework | HuggingFace TRL + PEFT |
| Hardware | NVIDIA A100 80GB PCIe |
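
Stage 2 can be sketched similarly. Beta, the learning rate, the LoRA rank, and the dataset name come from the table; the remaining arguments are illustrative assumptions.

```python
from peft import LoraConfig
from trl import DPOConfig

# LoRA adapter reused for preference optimization (rank from the table;
# alpha assumed to follow the stage-1 convention of 2x rank)
peft_config = LoraConfig(r=32, lora_alpha=64, task_type="CAUSAL_LM")

# Preference-optimization settings from the table above
dpo_args = DPOConfig(
    output_dir="turkish-llm-14b-dpo",  # illustrative path
    beta=0.1,                          # DPO temperature from the card
    learning_rate=5e-7,                # learning rate from the card
    bf16=True,
)

# Passed to trl.DPOTrainer together with the SFT checkpoint and the
# selimc/orpo-dpo-mix-TR-20k preference dataset.
```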

Benchmark Results

All evaluations were conducted under identical conditions. Scores represent accuracy (%).

| Model | MMLU_TR | XCOPA_TR | XNLI_TR |
|---|---|---|---|
| Qwen2.5-14B-Instruct (base) | 59.47 | 66.80 | 41.53 |
| Turkish-LLM-14B v3 (SFT+DPO) | 59.42 | 66.00 | 43.33 |
| Turkish-LLM-14B v4 (SFT) | 59.76 | 64.60 | 41.53 |
| Turkish-LLM-14B-Instruct (this model, v5 SFT+DPO) | 59.94 | 64.80 | 41.53 |

Key Findings

  • MMLU_TR: +0.47 points over the base model (59.47 -> 59.94), the highest improvement achieved across all Turkish fine-tuning experiments.
  • XCOPA_TR: A trade-off of -2.0 points (66.80 -> 64.80) was observed, consistent with the shift toward STEM-focused training data. The XCOPA test set contains only 500 examples, making small score differences statistically marginal.
  • XNLI_TR: Maintained at base model level (41.53), indicating no degradation on natural language inference.
  • Multiple training strategies were explored (SFT, DPO, KTO, DARE-TIES merge) across 6 model versions to find the optimal configuration.
  • Future model versions will incorporate continued pretraining on large-scale Turkish corpora to improve all benchmarks simultaneously.
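
The statistical caveat on XCOPA_TR can be made concrete with a binomial standard-error estimate: on a 500-example test set, one standard error around the base accuracy is roughly 2.1 points, so the observed -2.0 shift sits within a single standard error.

```python
import math

def binomial_se(p, n):
    """Standard error of an accuracy estimate p measured on n examples."""
    return math.sqrt(p * (1 - p) / n)

# XCOPA_TR: 500 test examples, base accuracy 66.80%
se_points = 100 * binomial_se(0.668, 500)
print(f"1 standard error ~= {se_points:.1f} points")  # prints "1 standard error ~= 2.1 points"
```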

Model Family

Turkish-LLM is a family of instruction-tuned Turkish language models at multiple scales:

| Model | Parameters | Base Model | Status |
|---|---|---|---|
| Turkish-LLM-1.5B-Instruct | 1.5B | Qwen2.5-1.5B | Coming Soon |
| Turkish-LLM-3B-Instruct | 3B | Qwen2.5-3B | Coming Soon |
| Turkish-LLM-7B-Instruct | 7B | Turkcell-LLM-7b | Available |
| Turkish-LLM-14B-Instruct | 14.7B | Qwen2.5-14B | Available |
| Turkish-LLM-14B-Instruct-GGUF | 14.7B | Qwen2.5-14B | Available |
| Turkish-LLM-32B-Instruct | 32B | Qwen2.5-32B | Coming Soon |

Usage

1. Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "ogulcanaydogan/Turkish-LLM-14B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "system", "content": "Sen yardimci bir Turkce yapay zeka asistanisin."},
    {"role": "user", "content": "Turkiye'nin en buyuk golu hangisidir?"}
]

# Build the chat-formatted prompt and generate
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, top_p=0.9)

# Decode only the newly generated tokens
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

2. vLLM (High-Performance Serving)

```python
from vllm import LLM, SamplingParams

llm = LLM(model="ogulcanaydogan/Turkish-LLM-14B-Instruct", dtype="bfloat16")
params = SamplingParams(temperature=0.7, top_p=0.9, max_tokens=512)
outputs = llm.generate(["Yapay zeka nedir?"], params)
print(outputs[0].outputs[0].text)
```

3. Ollama (Local Deployment)

```shell
# Download and run the GGUF quantized version
ollama run ogulcanaydogan/Turkish-LLM-14B-Instruct-GGUF
```

Limitations and Bias

While Turkish-LLM-14B-Instruct represents a meaningful step forward for Turkish NLP, users should be aware of the following limitations:

  • Hallucination: Like all large language models, this model can generate plausible-sounding but factually incorrect information. It should not be used as a sole source of truth for critical applications.
  • Training data scope: The SFT dataset (~242K examples) covers science, history, geography, and general knowledge but does not exhaustively represent all Turkish domains. Performance on highly specialized topics (e.g., legal, medical) may be limited.
  • Bias: The model inherits biases present in both the base model's pretraining data and the Turkish fine-tuning data. Outputs may reflect societal biases, stereotypes, or cultural assumptions.
  • Context length: The model supports a maximum context of 4,096 tokens. Inputs exceeding this length will be truncated.
  • Turkish-centric: While the model retains multilingual capabilities from the Qwen2.5 base, it has been optimized specifically for Turkish. Performance on other languages may differ from the base model.
  • Safety: Although DPO alignment reduces the likelihood of harmful outputs, no language model is fully safe. Users should implement additional safety measures for production deployments.
  • Evaluation coverage: Benchmarks capture specific aspects of language understanding. Real-world performance may vary from benchmark scores depending on the use case.

We encourage users to evaluate the model on their specific use cases and to report any issues or concerns.


Citation

If you use Turkish-LLM-14B-Instruct in your research or applications, please cite:

```bibtex
@misc{aydogan2026turkishllm14b,
    title={Turkish-LLM-14B-Instruct: A Fine-Tuned Turkish Language Model with SFT and DPO},
    author={Ogulcan Aydogan},
    year={2026},
    url={https://huggingface.co/ogulcanaydogan/Turkish-LLM-14B-Instruct},
    note={Fine-tuned from Qwen/Qwen2.5-14B-Instruct with supervised fine-tuning and direct preference optimization for Turkish}
}
```

Turkish Summary

Turkish-LLM-14B-Instruct -- Turkish Language Model


Overview

Turkish-LLM-14B-Instruct is a 14.7-billion-parameter Turkish language model built on the Qwen/Qwen2.5-14B-Instruct base model. It was trained in a two-stage process:

  1. SFT (Supervised Fine-Tuning): trained on approximately 242,000 Turkish instruction-response pairs covering science, history, geography, and general knowledge.
  2. DPO (Direct Preference Optimization): model outputs were aligned using 19,900 Turkish preference pairs.

This two-stage approach produced measurable improvements on Turkish natural language understanding and generation tasks.


Why a Turkish Language Model?

Turkish, with over 80 million native speakers, is one of the most widely spoken languages in the world. Despite this, Turkish remains underrepresented in the large language model ecosystem. The vast majority of existing models are trained on English data, and their Turkish capabilities remain limited.

Turkish is a distinctive language with agglutinative morphology, vowel harmony, and SOV word order. Modeling these properties effectively requires Turkish-specific training data and fine-tuning processes.


Benchmark Comparison

| Model | MMLU_TR | XCOPA_TR | XNLI_TR |
|---|---|---|---|
| Qwen2.5-14B-Instruct (base) | 59.47 | 66.80 | 41.53 |
| Turkish-LLM-14B-Instruct (v5 SFT+DPO) | 59.94 | 64.80 | 41.53 |

  • MMLU_TR improved by +0.47 points over the base model.
  • A -2.0 point shift on XCOPA_TR was observed, attributable to the STEM-focused training data (statistically marginal on a 500-example test set).
  • Continued pretraining on large-scale Turkish corpora is planned for future versions.

Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_adi = "ogulcanaydogan/Turkish-LLM-14B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_adi)
model = AutoModelForCausalLM.from_pretrained(
    model_adi,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

mesajlar = [
    {"role": "system", "content": "Sen yardimci bir Turkce yapay zeka asistanisin."},
    {"role": "user", "content": "Turkiye'nin en buyuk golu hangisidir?"}
]

metin = tokenizer.apply_chat_template(mesajlar, tokenize=False, add_generation_prompt=True)
girdiler = tokenizer(metin, return_tensors="pt").to(model.device)
ciktilar = model.generate(**girdiler, max_new_tokens=512, temperature=0.7, top_p=0.9)
print(tokenizer.decode(ciktilar[0][girdiler.input_ids.shape[-1]:], skip_special_tokens=True))
```

GGUF builds are also available for local use:

```shell
ollama run ogulcanaydogan/Turkish-LLM-14B-Instruct-GGUF
```

Limitations

  • Like all large language models, it can produce incorrect or fabricated information.
  • The training data covers specific domains; performance on specialized topics (law, medicine, etc.) may be limited.
  • The maximum context length is limited to 4,096 tokens.
  • Additional safety measures are recommended for production environments.


Developed by Ogulcan Aydogan
