
	---
	library_name: transformers
	tags:
	- math
	- qwen2
	- aimo
	license: mit
	datasets:
	- Floppanacci/QWQ-LongCOT-AIMO
	base_model:
	- deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
	language:
	- en
	---

	# DeepSeek-R1-Distill-Qwen-7B Fine-tuned for AIMO Math Problems

	This model is a fine-tuned version of `deepseek-ai/DeepSeek-R1-Distill-Qwen-7B` on the [`Floppanacci/QWQ-LongCOT-AIMO`](https://huggingface.co/datasets/Floppanacci/QWQ-LongCOT-AIMO) dataset.

	## Model Description

	The model was fine-tuned to improve performance on mathematical reasoning tasks, particularly problems that call for step-by-step (chain-of-thought) solutions, such as those in the [AI Mathematical Olympiad (AIMO)](https://www.kaggle.com/competitions/ai-mathematical-olympiad-progress-prize-2) competition.

	It was trained on about 29.5k math questions, each paired with a detailed solution.

	An [AWQ quantized version](https://huggingface.co/Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci-AWQ) is also available for faster inference and reduced memory usage.
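
For a rough sense of the memory savings, the sketch below estimates the size of the weights alone at two precisions. It rests on assumptions not stated in this card: a parameter count of about 7.6 billion (an approximate figure for this model family) and 4 bits per weight for the AWQ build. It ignores quantization scales, the KV cache, and activations, so real usage will be higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3


# Assumption: approximate parameter count, not taken from this model card.
NUM_PARAMS = 7.6e9

bf16_gib = weight_memory_gib(NUM_PARAMS, 16)  # bfloat16 / float16 weights
awq_gib = weight_memory_gib(NUM_PARAMS, 4)    # assumed 4-bit AWQ weights

print(f"16-bit weights: ~{bf16_gib:.1f} GiB")
print(f"4-bit weights:  ~{awq_gib:.1f} GiB")
```

Under these assumptions the quantized weights take about a quarter of the space, which is what makes the AWQ build practical on smaller GPUs.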

	## How to Use

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model_id = "Floppanacci/DeepSeek-R1-Distill-Qwen-7B-Floppanacci"
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForCausalLM.from_pretrained(
	    model_id,
	    torch_dtype=torch.bfloat16,  # or torch.float16
	    device_map="auto",
	)

	# Example Prompt (adjust based on how the model expects input)
	prompt = "Question: What is the value of $2+2$? Answer:"
	inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

	# Generate
	outputs = model.generate(**inputs, max_new_tokens=8192, temperature=0.7, do_sample=True)
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)

	print(response)
	```
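
Because generation above uses sampling (`do_sample=True`), repeated runs can produce different solutions. A minimal sketch of one way to post-process them follows: extract the final answer from each response and take a majority vote. It assumes the model writes its final answer inside `\boxed{...}`, which is a common convention for R1-style models but is not confirmed by this card. The helper names `extract_boxed` and `majority_vote` are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    depth = 1
    chars = []
    # Walk forward, tracking brace depth so nested braces are kept intact.
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return "".join(chars).strip()
        chars.append(ch)
    return None  # braces never closed


def majority_vote(responses: List[str]) -> Optional[str]:
    """Return the most common boxed answer, ignoring responses without one."""
    answers = [a for a in map(extract_boxed, responses) if a is not None]
    if not answers:
        return None
    return Counter(answers).most_common(1)[0][0]


# Stand-in strings; in practice these would be several decoded generations.
samples = [
    "... so the answer is \\boxed{4}.",
    "We get \\boxed{4}",
    "Thus \\boxed{5}",
    "no final answer here",
]
print(majority_vote(samples))
```

Note that `response` in the snippet above includes the prompt text, so searching for the last boxed expression avoids picking up anything from the question itself.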

	## Training Data

	The model was fine-tuned on the train split of the [`Floppanacci/QWQ-LongCOT-AIMO`](https://huggingface.co/datasets/Floppanacci/QWQ-LongCOT-AIMO) dataset (29.5k examples).