A newer version of this model is available: kxdw2580/DeepSeek-R1-0528-Qwen3-8B-catgirl-v2.5

We have released the new v2-qwen dataset to evaluate performance advantages on large-scale models.

Due to significant hallucination issues in the common subset, the results were not satisfactory.

Additionally, LoRA with bitsandbytes 8-bit quantization was used during fine-tuning to accelerate training. As a result, the model's efficiency may be compromised compared to full-precision models.
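To give a rough sense of why 8-bit quantization can cost some precision, here is a minimal sketch of absmax int8 quantization on a handful of made-up weights. It is a simplified illustration only, not the actual bitsandbytes implementation (which is more involved), and the function names and sample values are hypothetical.

```python
def quantize_absmax_int8(values):
    """Map floats to integer codes in [-127, 127] using one absmax scale."""
    absmax = max(abs(v) for v in values)
    if absmax == 0.0:
        # All-zero input: any scale works, so use 1.0 to avoid dividing by zero.
        return [0] * len(values), 1.0
    scale = absmax / 127.0
    return [round(v / scale) for v in values], scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Made-up example weights (assumption, not taken from the model).
weights = [0.5, -1.27, 0.013, 0.9991, -0.3333]

codes, scale = quantize_absmax_int8(weights)
restored = dequantize(codes, scale)

# Each value can be off by at most half a quantization step (scale / 2).
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("codes:", codes)
print(f"max round-trip error: {max_error:.5f} (bound: {scale / 2:.5f})")
```

Small weights lose the most relative precision here, because every value in the group shares the step size set by the largest magnitude.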

Safetensors: model size 8.19B params, tensor type BF16

Model tree for kxdw2580/DeepSeek-R1-0528-Qwen3-8B-Catgirl-0531-test-all

Base model: deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

Finetuned (this model): kxdw2580/DeepSeek-R1-0528-Qwen3-8B-Catgirl-0531-test-all
