TypeScript SLM 1.5B
A specialized Small Language Model for TypeScript code generation and understanding, optimized for React, Next.js, Angular, and Node.js frameworks.
Model Description
This model is a fine-tuned version of Qwen/Qwen2.5-Coder-1.5B-Instruct using LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning. It has been trained on 2,000-8,000 high-quality TypeScript code samples (depending on the dataset tier) focusing on modern web development frameworks.
Key Features:
- Specialized in TypeScript and popular frameworks (React, Next.js, Angular, Node.js)
- Quality-scored training dataset with proper type annotations
- Optimized for code completion, generation, and understanding tasks
- Efficient inference with LoRA adapters
Intended Use
Primary Use Cases
- TypeScript code completion and generation
- React component scaffolding
- Next.js API route and page generation
- Angular service and directive creation
- Node.js/Express backend code generation
- Type definition and interface creation
Out-of-Scope Use
- Production-critical code generation without human review
- Non-TypeScript/JavaScript code generation
- General-purpose text generation
- Code obfuscation or malicious code generation
Training Data
The model was trained on a curated dataset of TypeScript code samples with the following distribution (ranges reflect the different dataset tiers):
- React (43-58%): Components, hooks, context, custom hooks
- Angular (33-50%): Services, directives, modules, dependency injection
- Next.js (21-23%): Pages, API routes, SSR, SSG patterns
- TypeScript (9-16%): Advanced types, generics, utility types
- Node.js (6-11%): Express, NestJS, API servers
Dataset Quality Scoring (an illustrative filter sketch follows this list):
- Samples scored 41-64 on quality metrics
- Prioritizes proper type annotations
- Excludes test files, debug code, and incomplete modules
- Focuses on production-quality patterns from popular repositories
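The actual scoring pipeline lives in the linked repository; purely as an illustration of this kind of filter, a sketch might look like the following (the score_sample heuristic, field names, and threshold are hypothetical, not the real implementation):
import json

# Hypothetical quality filter; the real scoring logic is in the training repository.
TEST_FILE_MARKERS = (".spec.ts", ".test.ts")
DEBUG_MARKERS = ("console.log(", "debugger;")

def score_sample(code: str) -> int:
    """Toy heuristic: reward explicit typing, penalize debug noise."""
    score = 10 * code.count("interface ") + 5 * code.count(": ")
    score -= 20 * sum(code.count(m) for m in DEBUG_MARKERS)
    return score

def keep(sample: dict, threshold: int = 41) -> bool:
    if any(m in sample.get("path", "") for m in TEST_FILE_MARKERS):
        return False  # exclude test files
    return score_sample(sample["code"]) >= threshold  # "code" field name is assumed

with open("train.jsonl") as f:
    samples = [json.loads(line) for line in f]
filtered = [s for s in samples if keep(s)]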
Training Procedure
Training Hyperparameters
Hardware:
- Google Colab A100 40GB GPU
- CUDA acceleration with FP16 precision
Configuration (a training sketch using these settings follows the timing estimates below):
- Base Model: Qwen/Qwen2.5-Coder-1.5B-Instruct
- Training Samples: 2,000-8,000 (depending on dataset tier)
- Epochs: 3
- Batch Size: 4
- Gradient Accumulation Steps: 8
- Effective Batch Size: 32
- Learning Rate: 2e-4
- Max Sequence Length: 1024
- LoRA Rank (r): 32
- LoRA Alpha: 16
- LoRA Dropout: 0.1
- Target Modules: All linear layers
Training Time:
- train_small.jsonl (2k samples): ~20-30 minutes on A100
- train_medium.jsonl (5k samples): ~50-75 minutes on A100
- train.jsonl (8k samples): ~2-3 hours on A100
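The complete training script is in the linked repository; as a minimal sketch, an equivalent LoRA + SFT setup with the hyperparameters above could look roughly like this (the output path and dataset choice are assumptions, and the exact sequence-length field depends on your TRL version):
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# LoRA settings matching the configuration listed above
peft_config = LoraConfig(
    r=32,
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules="all-linear",  # adapt every linear layer
    task_type="CAUSAL_LM",
)

# 4 per-device x 8 accumulation steps = effective batch size 32
training_args = SFTConfig(
    output_dir="typescript-slm-1.5b",  # assumed output path
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    fp16=True,
    logging_steps=10,
    # cap sequences at 1024 tokens via the sequence-length field of your TRL version
)

dataset = load_dataset("json", data_files="train_small.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-1.5B-Instruct",
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()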
Usage
Basic Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
# Load base model and tokenizer
base_model = "Qwen/Qwen2.5-Coder-1.5B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)
# Load LoRA adapter
model = PeftModel.from_pretrained(model, "sylvester-francis/typescript-slm-1.5b")
# Generate code
prompt = """Write a React component that fetches user data and displays it in a card:
```typescript
"""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    temperature=0.7,
    do_sample=True,
    top_p=0.95
)
generated_code = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_code)
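The framework-specific examples below reuse the same generation call; a small helper keeps them short (generate_code is only an illustrative name, not part of the model's API):
def generate_code(prompt: str, max_new_tokens: int = 256) -> str:
    """Tokenize a prompt, sample a completion, and return the decoded text."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        temperature=0.7,
        do_sample=True,
        top_p=0.95,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)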
React Component Generation
prompt = """Create a TypeScript React component with props for a user profile card:
```typescript
interface UserProfileProps {
"""
# Generate with the model...
Next.js API Route
prompt = """Write a Next.js API route for user authentication:
```typescript
// pages/api/auth/login.ts
"""
# Generate with the model...
Angular Service
prompt = """Create an Angular service for HTTP data fetching:
```typescript
import { Injectable } from '@angular/core';
"""
# Generate with the model...
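Node.js/Express Endpoint
Express backend generation is also a listed use case; a prompt in the same style (the file path in the comment is illustrative) might look like:
prompt = """Write an Express route handler in TypeScript that returns a list of users:
```typescript
// src/routes/users.ts
"""
# Generate with the model...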
Performance
Code Quality Metrics
- Proper TypeScript type annotations
- Framework-specific best practices
- Adherence to modern ES6+ patterns
- Clean, readable code structure
Generation Speed
- Average: ~50-100 tokens/second on A100 (a measurement sketch follows this list)
- Latency: <100ms for typical completions
- Memory: ~3GB VRAM with adapter loaded
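Throughput depends on hardware and generation settings; a minimal sketch for measuring it on your own setup, reusing the model and tokenizer loaded in the usage section (the prompt is just an example):
import time

prompt = "Write a TypeScript function that debounces another function:\n```typescript\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
elapsed = time.perf_counter() - start

new_tokens = outputs.shape[1] - inputs["input_ids"].shape[1]
print(f"{new_tokens / elapsed:.1f} tokens/second")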
Limitations
Specialized Domain: Works best for TypeScript and related frameworks. Performance degrades for other languages.
Training Data Bias: Reflects patterns from popular open-source repositories, which may not match all coding styles.
Context Length: Limited to 1024 tokens, which may be insufficient for very large files.
No Real-time Updates: Training data is static and doesn't include the latest framework versions or patterns.
Requires Human Review: Generated code should always be reviewed for security, correctness, and best practices.
Type Safety: While trained on typed code, generated types may not always be complete or optimal.
Ethical Considerations
- Code Licensing: Ensure generated code complies with your project's license requirements
- Security: Always review generated code for security vulnerabilities
- Testing: Generated code should be thoroughly tested before production use
- Attribution: Consider the training data sources when using generated code commercially
Training Infrastructure
Software Stack:
- PyTorch 2.9.0+cu126
- Transformers 4.57.2
- PEFT 0.18.0
- TRL 0.25.1
- Datasets 4.0.0
- bitsandbytes 0.41.0+
Platform:
- Google Colab Pro (recommended)
- Supports Mac M4 (MPS) for local training (slower)
- Compatible with T4, A100, and other CUDA GPUs (see the device-selection sketch below)
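device_map="auto" covers CUDA GPUs; for Apple silicon (MPS) or CPU fallback when running locally, a simple device check is enough. A sketch, assuming the same loading code as in the usage section:
import torch
from transformers import AutoModelForCausalLM

if torch.cuda.is_available():
    device = "cuda"
elif torch.backends.mps.is_available():
    device = "mps"  # Apple silicon, e.g. an M4 MacBook
else:
    device = "cpu"

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-1.5B-Instruct",
    torch_dtype="auto",
).to(device)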
Repository
Full training code, dataset filtering, and usage examples: https://github.com/sylvester-francis/slm-typescript-model
Model Card Authors
- Sylvester Francis (@sylvester-francis)
Citations
Base Model
@article{qwen2.5,
  title={Qwen2.5-Coder Technical Report},
  author={Qwen Team},
  year={2024},
  journal={arXiv preprint arXiv:2409.12186},
  url={https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct}
}
Training Framework
@misc{vonwerra2022trl,
  title={{TRL: Transformer Reinforcement Learning}},
  author={Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallou{\'e}dec},
  year={2020},
  journal={GitHub repository},
  publisher={GitHub},
  howpublished={\url{https://github.com/huggingface/trl}}
}
LoRA
@article{hu2021lora,
  title={LoRA: Low-Rank Adaptation of Large Language Models},
  author={Hu, Edward J and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Wang, Lu and Chen, Weizhu},
  journal={arXiv preprint arXiv:2106.09685},
  year={2021}
}
License
MIT License - See repository for full license text.
Acknowledgments
- Built on Qwen 2.5 Coder by Alibaba Cloud
- Training powered by Hugging Face TRL and PEFT libraries
- Dataset curated from high-quality open-source TypeScript projects