Lyte committed
Commit 90a0b17 · verified · 1 Parent(s): 47fd80d

Update README.md

Files changed (1)
  1. README.md +2 -0
README.md CHANGED
@@ -99,6 +99,8 @@ This model was trained with GRPO, a method introduced in [DeepSeekMath: Pushing

  * Evaluations were conducted on the [Lyte/ConnectFour-T10](hf.co/datasets/Lyte/ConnectFour-T10) dataset's validation split to test whether the model learns to win by presenting it with a board showing only the winning position left.

+ * evals sampling parameters are as follows:
+ * temperature=0.6, top_p=0.95, max_tokens=1024

  #### Summary Metrics Comparison

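The sampling parameters added in this commit (`temperature=0.6, top_p=0.95, max_tokens=1024`) can be read as follows: `temperature` rescales the logits before the softmax, `top_p` keeps only the smallest set of most likely tokens whose cumulative probability reaches that value, and `max_tokens` caps the length of each generated answer. The block below is a minimal pure-Python sketch of the first two steps for a single token. It is not the evaluation code used for this model; the function names `nucleus_probs` and `sample_token` and the example logits are hypothetical and exist only for illustration.

```python
import math
import random

# Values quoted from the README change; everything else here is illustrative.
SAMPLING_PARAMS = {"temperature": 0.6, "top_p": 0.95, "max_tokens": 1024}


def nucleus_probs(logits, temperature, top_p):
    """Return the distribution left after temperature scaling and top-p filtering."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]  # subtract the max for stability
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of highest-probability tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in order:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break

    # Zero out everything outside the nucleus and renormalize the rest.
    filtered = [0.0] * len(probs)
    for i in kept:
        filtered[i] = probs[i] / mass
    return filtered


def sample_token(logits, temperature, top_p, rng=random):
    """Draw one token index from the filtered distribution."""
    probs = nucleus_probs(logits, temperature, top_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    example_logits = [2.0, 1.0, 0.0, -5.0]  # made-up scores for four tokens
    print(nucleus_probs(example_logits,
                        SAMPLING_PARAMS["temperature"],
                        SAMPLING_PARAMS["top_p"]))
```

With these made-up logits, only the two most likely tokens survive the `top_p=0.95` cut, so the other two can never be sampled. In a full generation loop, `max_tokens` would simply bound how many times `sample_token` is called for one answer.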