metadata

license: mit
language:
  - en
pipeline_tag: text-classification
tags:
  - pytorch
  - reward_model
  - transformers
  - RLHF

Model Card for Model ID

This model is part of the Chai reward-model series. It uses the GPT-2 architecture with a classification head and is optimised to predict whether a user will accept a completion generated by the base model.
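The architecture described above (GPT-2 with a classification head) corresponds to `GPT2ForSequenceClassification` in the `transformers` library. The following is a minimal sketch, not the released model: it builds a tiny, randomly initialised network from a config so that it runs without downloading weights. The layer sizes, the use of two labels, the choice of index 1 as the "accepted" class, and the token IDs are all assumptions made for illustration.

```python
import torch
from transformers import GPT2Config, GPT2ForSequenceClassification

torch.manual_seed(0)

# Tiny illustrative config (assumed sizes, NOT the released model's).
config = GPT2Config(
    vocab_size=100,
    n_positions=32,
    n_embd=32,
    n_layer=2,
    n_head=2,
    num_labels=2,    # assumption: two classes, rejected / accepted
    pad_token_id=0,  # lets the head locate the last non-padding token
)
model = GPT2ForSequenceClassification(config)
model.eval()

# Made-up token IDs standing in for "conversation + candidate completion",
# right-padded with the pad token.
input_ids = torch.tensor([[11, 42, 7, 3, 0, 0]])
attention_mask = (input_ids != config.pad_token_id).long()

with torch.no_grad():
    logits = model(input_ids=input_ids, attention_mask=attention_mask).logits

# Turn the logits into class probabilities.
probs = torch.softmax(logits, dim=-1)
accept_prob = probs[0, 1].item()  # assumption: index 1 means "accepted"
print(f"accept probability (random weights): {accept_prob:.3f}")
```

With real weights, the same forward pass would typically be preceded by `from_pretrained` calls for the tokenizer and model, using the repository name of this model.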

It was trained on retry_and_continue_50m_reward_model, a dataset consisting purely of user-generated content, in which a user can decline a generated response by pressing the retry button or by ending the conversation.
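One way to picture how such interactions could become classification examples is sketched below. This is a hypothetical illustration, not the dataset's documented schema: the `Turn` record, its field names, the action strings, and the rule "continue means accepted, retry or end means declined" are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical action names; the real dataset may encode these differently.
VALID_ACTIONS = {"continue", "retry", "end"}


@dataclass
class Turn:
    context: str      # conversation so far
    response: str     # completion generated by the base model
    user_action: str  # what the user did next (assumed field)


def to_example(turn: Turn) -> dict:
    """Map one logged turn to a (text, label) pair for a binary classifier."""
    if turn.user_action not in VALID_ACTIONS:
        raise ValueError(f"unknown user action: {turn.user_action!r}")
    # Assumed labelling rule: continuing the chat counts as acceptance.
    label = 1 if turn.user_action == "continue" else 0
    return {"text": f"{turn.context}\n{turn.response}", "label": label}


turns = [
    Turn("User: hi", "Bot: Hello! How was your day?", "continue"),
    Turn("User: hi", "Bot: hi", "retry"),
    Turn("User: bye", "Bot: ok", "end"),
]
examples = [to_example(t) for t in turns]
labels = [e["label"] for e in examples]
print(labels)
```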

Model Details

Developed by Chai Research
Model type: Transformer-based Classification Model
Language: English
Contact: to ask questions about this model, join the Chai Discord. For general correspondence: [email protected]