---
base_model: teknium/OpenHermes-2.5-Mistral-7B
tags:
- mistral
- instruct
- finetune
- chatml
- gpt4
- synthetic data
- distillation
- dpo
- rlhf
license: apache-2.0
language:
- en
datasets:
- argilla/distilabel-math-preference-dpo
---
This model was fine-tuned with the DPO (Direct Preference Optimization) technique. The goal was to test whether the base model's capabilities in mathematics could be improved.
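For readers unfamiliar with DPO, the sketch below shows the per-pair loss it optimizes: the policy is pushed to assign a higher log-likelihood margin to the preferred (chosen) answer over the rejected one, relative to a frozen reference model. This is a minimal illustration in plain Python; the function name and the toy log-probabilities are hypothetical, not taken from this model's training code.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))
    where pi_* / ref_* are sequence log-probs under policy / reference.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): small when the policy's preference margin is large
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy numbers: the policy likes the chosen answer more (relative to the
# reference) than the rejected one, so the margin is positive and the
# loss drops below log(2), the value at a zero margin.
loss = dpo_loss(-10.0, -14.0, -11.0, -12.0)
```

In practice, training is done with a library such as TRL's `DPOTrainer`, which computes this loss over batches of (prompt, chosen, rejected) triples like those in the preference dataset above.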