Tara 1.3
Tara 1.3 is a tiny experimental AI-engineering tool-call model. It is trained to read a short User: prompt and emit a compact JSON object that either selects a tool or answers with tool: "none".
This release is best treated as a research checkpoint for structured tool-routing experiments, not as a production assistant.
Model Details
- Model name: tara1.3
- Internal checkpoint: tara-1.3-ai-engineer-toolcall-sft-v4-plaintok-from300/checkpoint-100
- Architecture: LlamaForCausalLM
- Context length: 1,024 tokens
- Vocabulary size: 16,384
- Hidden size: 512
- Layers: 7
- Attention heads: 8
- Weights format: safetensors
- License: Apache-2.0
Capability
Tara 1.3 is designed to produce JSON-style tool calls for a small AI-engineering tool set.
Supported tool names:
- weather
- search
- segment
- evaluate_model
- train_sft
- inspect_file
- extract_json
- none
Example target shapes:
{"tool":"weather","location":"Bangkok tomorrow"}
{"tool":"search","query":"Python list comprehension examples"}
{"tool":"train_sft","base_model":"models/base","dataset":"data/train.txt","output_dir":"outputs/run"}
{"tool":"none","response":"Tokenizer validation passed. Next, run a small generation smoke test."}
Quick Start
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
repo_id = "aungkomyint/tara1.3"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.eval()

def generate_tool_call(user_text):
    # Build the prompt in the recommended "User: ... / Assistant:" format.
    prompt = f"User: {user_text.strip()}\nAssistant:\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    # Drop token_type_ids if the tokenizer returns them; generate() may reject unused inputs.
    inputs.pop("token_type_ids", None)
    with torch.no_grad():
        output = model.generate(
            **inputs,
            max_new_tokens=120,
            do_sample=False,
            repetition_penalty=1.08,
            pad_token_id=tokenizer.pad_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )
    text = tokenizer.decode(output[0], skip_special_tokens=False)
    # Keep only the completion, then cut at the first end-of-text or pad marker.
    reply = text[len(prompt):] if text.startswith(prompt) else text.split("Assistant:", 1)[-1]
    reply = reply.split("<|endoftext|>", 1)[0].split("<|pad|>", 1)[0].strip()
    return reply

reply = generate_tool_call("Search the web for Python list comprehension examples.")
print(reply)
try:
    parsed = json.loads(reply)
    print("tool:", parsed.get("tool"))
except json.JSONDecodeError:
    print("Model did not return valid JSON for this sample.")
Recommended Prompt Format
Use a simple instruction format:
User: <request>
Assistant:
Greedy decoding is recommended for tool-call tests:
do_sample=False
max_new_tokens=120
repetition_penalty=1.08
Training Summary
Tara 1.3 was trained as a supervised fine-tuning continuation for AI-engineering tool calls.
Training configuration:
- Steps: 300
- Block size: 1,024
- Batch size: 8
- Gradient accumulation: 4
- Effective batch size: 32
- Learning rate: 5e-5
- Warmup steps: 15
- Weight decay: 0.01
- Loss mask: only the final Assistant response is trained; earlier turns are context
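
The loss mask described above can be illustrated with a minimal sketch. It assumes the common PyTorch convention of marking ignored positions with the label value -100; the card does not say how the training code implements the mask, and the function name and token ids below are hypothetical.

```python
# Assumption: -100 is the label value ignored by PyTorch's cross-entropy loss
# by default. The actual Tara training code is not shown in this card.
IGNORE_INDEX = -100


def mask_labels(input_ids, response_start):
    """Copy input_ids into labels, ignoring every position before response_start."""
    if not 0 <= response_start <= len(input_ids):
        raise ValueError("response_start must lie within the sequence")
    return [IGNORE_INDEX] * response_start + list(input_ids[response_start:])


# Hypothetical token ids: 5 context tokens followed by a 3-token final response.
input_ids = [11, 12, 13, 14, 15, 21, 22, 23]
labels = mask_labels(input_ids, response_start=5)
print(labels)  # [-100, -100, -100, -100, -100, 21, 22, 23]
```

With labels built this way, only the final Assistant response contributes to the loss, while earlier turns still provide context to the model.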
Dataset:
- Train examples: 8,945
- Eval examples: 777
- Mixture: no-tool/chat behavior plus capped tool-call examples
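
As a rough sanity check on the numbers above, the following sketch reproduces the effective batch size and estimates how many passes over the training set 300 steps represent. It assumes a single device and one training example per sequence (no packing), neither of which the card states, so the epoch figure is only an approximation.

```python
# Values copied from the training configuration and dataset lists above.
batch_size = 8
grad_accum = 4
steps = 300
warmup_steps = 15
train_examples = 8945

# Assumption: single device, so effective batch = batch size x accumulation.
effective_batch = batch_size * grad_accum

# Assumption: one example per sequence (no packing into 1,024-token blocks).
sequences_seen = effective_batch * steps
approx_epochs = sequences_seen / train_examples
warmup_fraction = warmup_steps / steps

print(effective_batch)           # 32
print(sequences_seen)            # 9600
print(round(approx_epochs, 2))   # 1.07
print(warmup_fraction)           # 0.05
```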
Local Evaluation
The released checkpoint was selected from a small local comparison on 2026-06-24.
The 10-prompt eval covered weather, search, segmentation, model evaluation, SFT training, file inspection, JSON extraction, and no-tool/general responses.
| Checkpoint | Valid JSON | Schema OK | Expected Tool Match |
|---|---|---|---|
| checkpoint-100 | 6/10 | 5/10 | 5/10 |
| checkpoint-200 | 5/10 | 5/10 | 5/10 |
| checkpoint-300 | 5/10 | 5/10 | 5/10 |
checkpoint-100 was selected because it tied the other continued checkpoints on schema correctness and tool selection while producing one more valid JSON output.
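
The selection rule can be expressed as a small sketch over the table values. The ordering of the tie-break (schema correctness, then tool match, then valid JSON) is an assumption made for illustration; with these scores any ordering of the three metrics picks the same checkpoint.

```python
# Scores out of 10, copied from the evaluation table above.
results = {
    "checkpoint-100": {"valid_json": 6, "schema_ok": 5, "tool_match": 5},
    "checkpoint-200": {"valid_json": 5, "schema_ok": 5, "tool_match": 5},
    "checkpoint-300": {"valid_json": 5, "schema_ok": 5, "tool_match": 5},
}


def rank_key(name):
    """Compare by schema correctness, then tool match, then valid JSON (assumed order)."""
    r = results[name]
    return (r["schema_ok"], r["tool_match"], r["valid_json"])


best = max(results, key=rank_key)
print(best)  # checkpoint-100
```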
Tokenizer validation passed: tool-call JSON tokenizes with the plain BPE vocabulary alone, without the older chat/tool special tokens.
Limitations
- This is a very small experimental model.
- It can emit malformed JSON.
- It can choose the right tool but fill arguments with copied or unrelated values.
- General tool: "none" responses are unstable.
- It is not reliable for autonomous tool execution without validation, repair, and fallback logic.
- It should not be used for medical, legal, financial, safety, or other high-stakes decisions.
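
Because the model can emit malformed JSON or surround an object with extra text, a caller may want a conservative repair step before giving up. The sketch below is one possible approach, not part of the released code: it scans for the first decodable JSON object using the standard library's `json.JSONDecoder.raw_decode` and returns `None` when nothing can be recovered. The function name is hypothetical.

```python
import json


def extract_first_json_object(text):
    """Return the first decodable JSON object found in text, or None."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and
            # ignores any trailing text after it.
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        start = text.find("{", start + 1)
    return None


print(extract_first_json_object('{"tool":"search","query":"x"} extra text'))
print(extract_first_json_object('noise {"tool":"none","response":"ok"}'))
print(extract_first_json_object('{"tool":"search","query":'))  # None
```

A `None` result should be treated as a rejected output; the caller can then retry or fall back rather than execute anything.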
Suggested Runtime Guardrails
Applications should:
- Parse the output with a JSON parser.
- Validate the tool name against an allowlist.
- Validate required fields for each tool.
- Reject or repair malformed JSON.
- Require user confirmation before destructive or external actions.
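
The guardrails above could be sketched roughly as follows. This is an illustrative outline, not the author's implementation. The allowlist comes from the supported tool names in this card, and the required fields come only from the example target shapes; tools without a documented example are allowlisted but have no field checks here. The choice of which tools need confirmation is an application-specific assumption, and all function and constant names are hypothetical.

```python
import json

# Tool names listed in this card.
ALLOWED_TOOLS = {
    "weather", "search", "segment", "evaluate_model",
    "train_sft", "inspect_file", "extract_json", "none",
}

# Required fields taken from the card's example shapes only; fields for the
# remaining tools are not documented, so they are not guessed here.
REQUIRED_FIELDS = {
    "weather": {"location"},
    "search": {"query"},
    "train_sft": {"base_model", "dataset", "output_dir"},
    "none": {"response"},
}

# Assumption: which tools count as destructive or external is up to the application.
NEEDS_CONFIRMATION = {"train_sft"}


def validate_tool_call(reply):
    """Return (call, error); exactly one of the two is None."""
    try:
        call = json.loads(reply)
    except json.JSONDecodeError as exc:
        return None, f"malformed JSON: {exc.msg}"
    if not isinstance(call, dict):
        return None, "top-level value is not a JSON object"
    tool = call.get("tool")
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        return None, f"tool not in allowlist: {tool!r}"
    missing = REQUIRED_FIELDS.get(tool, set()) - set(call)
    if missing:
        return None, f"missing fields: {sorted(missing)}"
    return call, None


def requires_confirmation(call):
    """True when the application should ask the user before running the tool."""
    return call["tool"] in NEEDS_CONFIRMATION


call, error = validate_tool_call('{"tool":"search","query":"Python list comprehension examples"}')
print(call, error)
print(validate_tool_call('{"tool":"weather"}'))
```

Returning an error string instead of raising keeps the rejection path explicit, so the caller can decide whether to retry, repair, or fall back.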
Citation
If you use this model, cite it as:
Aung Ko Myint. Tara 1.3. 2026. Hugging Face model checkpoint.