Foundation Text-Generation Models Below 360M Parameters
Part of a collection of good candidates for fine-tuning targeting Wllama and Transformers.js on mobile devices, ordered by number of parameters.
Qwen2-96M is a small language model based on the Qwen2 architecture, trained from scratch on English datasets with a context length of 8192 tokens. With only 96 million parameters, this model serves as a lightweight base model that can be fine-tuned for specific tasks.
Due to its compact size, the model has significant limitations in reasoning, factual knowledge, and general capabilities compared to larger models. It may produce incorrect, irrelevant, or nonsensical outputs. Additionally, as it was trained on internet text data, it may contain biases and potentially generate inappropriate content.
pip install transformers==4.49.0 torch==2.6.0
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
import torch

model_path = "Felladrin/Qwen2-96M"
prompt = "I've been thinking about"

# Use the GPU when available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).to(device)

# Stream tokens to stdout as they are generated.
streamer = TextStreamer(tokenizer)

inputs = tokenizer(prompt, return_tensors="pt").to(device)

model.generate(
    inputs.input_ids,
    attention_mask=inputs.attention_mask,
    max_length=tokenizer.model_max_length,
    streamer=streamer,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    repetition_penalty=1.1,  # discourage verbatim repetition
    do_sample=True,          # sample instead of greedy decoding
    temperature=0.7,
    top_p=0.9,
    top_k=0,                 # disable top-k filtering
    min_p=0.1,               # drop tokens well below the top token's probability
)
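Since top_k=0 disables top-k filtering above, the min_p=0.1 setting does most of the truncation. As a rough, self-contained sketch (not the actual Transformers implementation), min-p keeps only tokens whose post-temperature probability is at least min_p times that of the most likely token, then renormalizes before sampling:

```python
import math

def min_p_filter(logits, min_p=0.1, temperature=0.7):
    """Illustrative min-p filtering over a list of raw logits.

    Returns a dict mapping surviving token indices to their
    renormalized probabilities.
    """
    # Apply temperature scaling, then a numerically stable softmax.
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Tokens below min_p * p(top token) are discarded.
    threshold = min_p * max(probs)
    kept = {i: p for i, p in enumerate(probs) if p >= threshold}

    # Renormalize the survivors so they form a distribution again.
    z = sum(kept.values())
    return {i: p / z for i, p in kept.items()}
```

With a peaked distribution, unlikely tokens fall below the threshold and are removed; with a flat distribution, most tokens survive, which is why min-p adapts better than a fixed top-k cutoff.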