A newer version of this model is available:
kxdw2580/DeepSeek-R1-0528-Qwen3-8B-catgirl-v2.5
We have released the new v2-qwen dataset to evaluate performance advantages on large-scale models.
The results were unsatisfactory, however, because of significant hallucination issues in the common subset.
Additionally, LoRA with bitsandbytes 8-bit quantization was used during fine-tuning to accelerate training, so the model's efficiency may be compromised compared to full-precision models.
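For readers unfamiliar with this setup, the following is a minimal sketch of how LoRA adapters are typically attached to an 8-bit base model using the `transformers`, `peft`, and `bitsandbytes` libraries. It is not the training script used for this model. The rank, alpha, dropout, and target modules in `LORA_KWARGS` are hypothetical values chosen for illustration, since the card does not state the ones actually used, and the helper name `build_lora_model` is likewise an assumption.

```python
# Base model named on this card.
BASE_MODEL = "deepseek-ai/DeepSeek-R1-0528-Qwen3-8B"

# Hypothetical LoRA hyperparameters: the card does not state the real values.
LORA_KWARGS = {
    "r": 16,                      # rank of the low-rank update matrices
    "lora_alpha": 32,             # scaling factor applied to the update
    "lora_dropout": 0.05,
    # Assumed attention projection names for a Qwen-style architecture.
    "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    "task_type": "CAUSAL_LM",
}


def build_lora_model(base_model: str = BASE_MODEL):
    """Load the base model in 8-bit and wrap it with LoRA adapters.

    Requires `transformers`, `peft`, `bitsandbytes`, and a CUDA GPU.
    The heavy imports live inside the function so the constants above
    can be inspected without those packages installed.
    """
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    # Store the frozen base weights in 8-bit to reduce GPU memory use.
    quant_config = BitsAndBytesConfig(load_in_8bit=True)
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        quantization_config=quant_config,
        device_map="auto",
    )

    # Prepare the quantized model for training (e.g. casts norm layers).
    model = prepare_model_for_kbit_training(model)

    # Only the small adapter matrices are trainable; base weights stay frozen.
    return get_peft_model(model, LoraConfig(**LORA_KWARGS))
```

Calling `build_lora_model().print_trainable_parameters()` reports how small the trainable adapter is relative to the full model. Because the base weights are stored in 8-bit during training, the learned adapters are fitted against quantized weights, which is one reason results can differ from a full-precision fine-tune.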
Model tree for kxdw2580/DeepSeek-R1-0528-Qwen3-8B-Catgirl-0531-test-all
Base model
deepseek-ai/DeepSeek-R1-0528-Qwen3-8B