---
base_model: teknium/OpenHermes-2.5-Mistral-7B
tags:
- mistral
- instruct
- finetune
- chatml
- gpt4
- synthetic data
- distillation
- dpo
- rlhf
license: apache-2.0
language:
- en
datasets:
- argilla/distilabel-math-preference-dpo
---
This model was fine-tuned with the DPO (Direct Preference Optimization) technique. The goal was to test whether the base model's capabilities in mathematics could be improved.
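For readers unfamiliar with DPO, the sketch below shows the per-pair loss it optimizes: the policy is pushed to assign a higher log-likelihood margin to the preferred (chosen) answer over the rejected one, relative to a frozen reference model. This is a minimal illustration in plain Python; the function name and the toy log-probabilities are hypothetical, not taken from this model's training code.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))
    where pi_* / ref_* are sequence log-probs under policy / reference.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log sigmoid(margin): small when the policy's preference margin is large
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Toy numbers: the policy likes the chosen answer more (relative to the
# reference) than the rejected one, so the margin is positive and the
# loss drops below log(2), the value at a zero margin.
loss = dpo_loss(-10.0, -14.0, -11.0, -12.0)
```

In practice, training is done with a library such as TRL's `DPOTrainer`, which computes this loss over batches of (prompt, chosen, rejected) triples like those in the preference dataset above.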