
Which model should I use?

#14
by laelhalawani - opened

Which model should I use for regular translations?
I see the example uses PPO, but the prompt (especially the CoT one) looks more like an instruct-style prompt.
If PPO is a better choice here than Instruct, what would be a good example prompt for an Instruct-type model?

ByteDance Seed org

@laelhalawani Thanks for your interest! For translation, we recommend the PPO model with beam search decoding. It uses the same prompt format as the Instruct model. If you want to fine-tune the model further or train it with reinforcement learning, we recommend starting from the Instruct model.
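For readers who want to try the recommendation above, here is a minimal sketch of beam search decoding with Hugging Face `transformers`. It is an illustration under assumptions, not the official example: the helper names (`build_prompt`, `beam_search_kwargs`, `translate`) are hypothetical, the prompt wording is a placeholder that should be replaced with the template from the model card, and `model_id` must be supplied by the caller because this thread does not name a repository. The beam width of 4 is an arbitrary choice.

```python
from typing import Any, Dict


def build_prompt(text: str, source_lang: str, target_lang: str) -> str:
    """Build a plain instruction prompt.

    Assumption: this wording is a placeholder, not the official template.
    """
    return f"Translate the following {source_lang} sentence into {target_lang}:\n{text}"


def beam_search_kwargs(num_beams: int = 4, max_new_tokens: int = 256) -> Dict[str, Any]:
    """Keyword arguments that select deterministic beam search in `generate`."""
    return {
        "num_beams": num_beams,      # number of hypotheses kept at each step
        "do_sample": False,          # no sampling, so decoding is deterministic
        "early_stopping": True,      # stop once enough finished beams exist
        "max_new_tokens": max_new_tokens,
    }


def translate(model_id: str, text: str, source_lang: str, target_lang: str) -> str:
    """Load a causal LM checkpoint and translate `text` with beam search."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    prompt = build_prompt(text, source_lang, target_lang)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **beam_search_kwargs())

    # Drop the prompt tokens so only the newly generated text is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `translate(model_id, "May the force be with you", "English", "Chinese")` with the PPO checkpoint's repository id would run the whole pipeline. Since the Instruct model is said to share the same prompt, the same function should also work with that checkpoint.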

Thank you very much!
