Aqui-open0-2 Lite: Efficient 1.72B Open Weights Reasoning Model

image/png

Aqui-open0-2 Lite is a compact yet powerful 1.72 billion parameter open weights reasoning model from Aqui Solutions, creators of AquiGPT. Fine-tuned on Qwen3 1.7B, this model delivers exceptional performance that rivals much larger models while being highly accessible for consumer hardware and edge deployment.

Key Features

  • Compact Architecture: 1.72B parameters fine-tuned on Qwen3 1.7B base
  • Outstanding Performance: Competitive with larger models in key benchmarks
  • 8-bit Precision: Optimized for efficiency without sacrificing quality
  • 40K Context Window: Expandable to 128K using YARN scaling
  • Strong Reasoning: Exceptional performance in instruction following and multilingual tasks
  • Open Weights: Fully open under Apache 2.0 license
  • Consumer-Friendly: Runs on modest hardware setups

Performance Benchmarks

Aqui-open0-2 Lite demonstrates exceptional performance across multiple challenging benchmarks, significantly outperforming other models in its size class:

Benchmark Aqui-open0-2 Lite (1.72B) Gemma 3 (1B) Qwen3 (2.03B) Llama 3.2 (1.24B) LFM2 (1.17B)
MMLU (General Knowledge) 67.5% 40.1% 59.1% 46.6% 55.2%
GPQA (Science) 31.8% 19.2% 27.7% 19.6% 31.5%
IFEval (Instruction Following) 73.4% 62.9% 68.4% 52.4% 74.5%
GSM8K (Grade School Math) 63.2% 59.6% 51.4% 35.7% 58.3%
MGSM (Multilingual) 70.2% 43.6% 66.6% 29.1% 55.0%
Average Performance 61.2% 45.1% 54.6% 36.7% 54.9%

Bold: Best performance, Italics: Second best

Model Specifications

  • Parameters: 1.72 billion
  • Base Model: Qwen3 1.7B
  • Context Window: 40,000 tokens (expandable to 128K with YARN)
  • Precision: 8-bit optimized
  • Architecture: Qwen transformer
  • Languages: 23+ languages with strong multilingual support
  • Knowledge Cutoff: October 2024

Hardware Requirements

Minimum Requirements

  • GPU: GTX 1660 (6GB VRAM) or RTX 3060
  • Mac: 8GB unified memory (Apple Silicon)
  • RAM: 8GB system memory
  • Storage: 4GB available space

Recommended Setup

  • GPU: RTX 3070 or RTX 4060 (8GB+)
  • CPU: Modern quad-core processor
  • RAM: 16GB+ for optimal performance
  • Storage: NVMe SSD for faster loading

Edge Deployment

  • Mobile: Capable of running on high-end mobile devices
  • Raspberry Pi: Compatible with Pi 5 with sufficient RAM
  • Embedded: Suitable for edge AI applications

Installation & Usage

Quick Start with Transformers

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load model and tokenizer
model_name = "aquigpt/open0-2-lite"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

# Generate response
prompt = "Write a Python function to implement binary search with detailed comments."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs, 
    max_length=1024, 
    temperature=0.7,
    do_sample=True
)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)

Using with vLLM

from vllm import LLM, SamplingParams

# Initialize model
llm = LLM(
    model="aquigpt/open0-2-lite",
    tensor_parallel_size=1,
    trust_remote_code=True
)

# Set sampling parameters
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.9,
    max_tokens=512
)

# Generate
prompts = ["Explain quantum computing in simple terms."]
outputs = llm.generate(prompts, sampling_params)
print(outputs[0].outputs[0].text)

Use Cases

Educational & Learning

  • Grade school mathematics assistance (GSM8K: 63.2%)
  • General knowledge queries (MMLU: 67.5%)
  • Multilingual learning support (MGSM: 70.2%)
  • Instruction following for educational tasks

Lightweight Development

  • Code generation for simple to moderate tasks
  • Algorithm implementation
  • Code review and debugging
  • Technical documentation

Edge AI Applications

  • On-device assistance
  • Offline reasoning tasks
  • Mobile app integration
  • IoT and embedded systems

Multilingual Support

  • Cross-language communication
  • Translation assistance
  • Multilingual content creation
  • Cultural context understanding

Quantization Options

Available quantization formats for different hardware setups:

  • BF16: ~3.4GB VRAM (full precision)
  • FP16: ~3.4GB VRAM (recommended)
  • INT8: ~1.7GB VRAM (efficient)
  • INT4: ~0.9GB VRAM (ultra-efficient for edge)

Fine-tuning Support

Aqui-open0-2 Lite supports various fine-tuning approaches:

  • LoRA/QLoRA: Parameter-efficient fine-tuning
  • Full Fine-tuning: Complete model adaptation
  • Custom Tokenizer: Domain-specific vocabulary
  • Multi-task Learning: Specialized task combinations

Comparison with Other Small Models

Aqui-open0-2 Lite significantly outperforms other models in its size class:

  • 67.5% MMLU: vs 59.1% (Qwen3 2B) and 55.2% (LFM2)
  • 73.4% IFEval: Leading instruction following performance
  • 70.2% MGSM: Superior multilingual capabilities
  • Efficiency: Best performance per parameter in class

Limitations

  • Knowledge cutoff at October 2024
  • May occasionally produce hallucinations
  • Limited compared to larger models for highly complex reasoning
  • 8-bit precision may impact some edge cases
  • Context extension reduces efficiency

License

This model is released under the Apache 2.0 License, enabling both research and commercial applications without restrictions.

Ethical Considerations

Aqui-open0-2 Lite is designed for beneficial applications. Users should:

  • Implement appropriate safety measures for production use
  • Consider bias mitigation in sensitive applications
  • Follow responsible AI practices
  • Respect applicable laws and regulations

Support & Community

Acknowledgments

  • Qwen Team: built the base model, Qwen3 1.7B;
  • HuggingFace: hosting the model weights.

Copyright 2025 Aqui Solutions. All rights reserved.

Downloads last month
28
Safetensors
Model size
1.72B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for aquigpt/open0-2-lite

Finetuned
Qwen/Qwen3-1.7B
Finetuned
(212)
this model
Quantizations
2 models

Collection including aquigpt/open0-2-lite