---
base_model: teknium/OpenHermes-2.5-Mistral-7B
tags:
  - mistral
  - instruct
  - finetune
  - chatml
  - gpt4
  - synthetic data
  - distillation
  - dpo
  - rlhf
license: apache-2.0
language:
  - en
datasets:
  - argilla/distilabel-math-preference-dpo
---

This model was finetuned with the DPO (Direct Preference Optimization) technique. The goal was to test whether the base model's capabilities in mathematics could be improved.
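
Since the base model uses the ChatML prompt format (per the `chatml` tag), prompts for this model would presumably follow the same convention. A minimal sketch of building such a prompt for a math question is below; the helper function name and the example messages are illustrative, not part of this repository:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-formatted prompt (illustrative helper).

    ChatML wraps each turn in <|im_start|>role ... <|im_end|> markers
    and leaves the prompt open at the assistant turn for generation.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

# Example: a math question matching the model's intended domain.
prompt = build_chatml_prompt(
    "You are a helpful assistant skilled in mathematics.",
    "What is the derivative of x^2?",
)
print(prompt)
```

The resulting string can be tokenized and passed to the model for generation; stopping on the `<|im_end|>` token keeps the output to a single assistant turn.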