|
--- |
|
library_name: transformers |
|
tags: |
|
- math |
|
- qwen2 |
|
- aimo |
|
license: mit |
|
datasets: |
|
- Floppanacci/QWQ-LongCOT-AIMO |
|
base_model: |
|
- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B |
|
language: |
|
- en |
|
--- |
|
|
|
# DeepSeek-R1-Distill-Qwen-7B Fine-tuned for AIMO Math Problems |
|
|
|
This model is a fine-tuned version of `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` on the [`Floppanacci/QWQ-LongCOT-AIMO`](https://huggingface.co/datasets/Floppanacci/QWQ-LongCOT-AIMO) dataset. |
|
|
|
## Model Description |
|
|
|
The model was fine-tuned to improve performance on mathematical reasoning tasks, particularly those involving step-by-step solutions (Chain-of-Thought) similar to problems found in the [AI Mathematical Olympiad (AIMO)](https://www.kaggle.com/competitions/ai-mathematical-olympiad-progress-prize-2) competition. |
|
|
|
It's trained on a dataset containing ~30k math questions paired with detailed solutions. |
|
|
|
An [AWQ quantized version](https://huggingface.co/Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci-AWQ) is also available for faster inference and reduced memory usage. |
|
|
|
## How to Use |
|
|
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
import torch |
|
|
|
model_id = "Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci" |
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_id, |
|
torch_dtype=torch.bfloat16, # or torch.float16 |
|
device_map="auto" |
|
) |
|
|
|
# Example Prompt (adjust based on how the model expects input) |
|
prompt = "Question: What is the value of $2+2$? Answer:" |
|
inputs = tokenizer(prompt, return_tensors="pt").to(model.device) |
|
|
|
# Generate |
|
outputs = model.generate(**inputs, max_new_tokens=8192, temperature=0.7, do_sample=True) |
|
response = tokenizer.decode(outputs[0], skip_special_tokens=True) |
|
|
|
print(response) |
|
``` |
|
|
|
## Training Data |
|
|
|
The model was fine-tuned on the train split of the [`Floppanacci/QWQ-LongCOT-AIMO`](https://huggingface.co/datasets/Floppanacci/QWQ-LongCOT-AIMO) dataset (29.5k examples). |