togethercomputer/RedPajama-INCITE-7B-Base

danfu09 committed on May 5, 2023

Commit 870f6da · 1 Parent(s): 8fd22c7

Add training details

Files changed (1)

README.md +5 -4

@@ -119,11 +119,12 @@ Please refer to [togethercomputer/RedPajama-Data-1T](https://huggingface.co/data
 **Training Procedure**
-- **Hardware:** TODO @Dan
-- **Optimizer:**
-- **Gradient Accumulations**:
+- **Hardware:** 512 nodes of 6xV100 (IBM Power9), on the OLCF Summit cluster
+- **Optimizer:** Apex FusedAdam
+- **Parallelism:** Pipeline parallel 12, model parallel 2
+- **Gradient Accumulations**: 8 (global batch size 4M tokens)
 - **Num of Tokens:** 800B Tokens
-- **Learning rate:**
+- **Learning rate:** 0.00012
 ## Community
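
The added training details imply some arithmetic that the README does not spell out. The sketch below works it through under assumptions that are not stated in the commit: that all 512 x 6 GPUs ran a single job, that every GPU outside the pipeline-parallel and model-parallel groups was used for data parallelism, and that "4M tokens" means 4 x 2^20. The function name `training_layout` and the dictionary keys are hypothetical, chosen only for illustration.

```python
# Back-of-the-envelope arithmetic from the training details in this commit.
# Values marked "assumption" are NOT stated in the README.

NODES = 512                 # from the README
GPUS_PER_NODE = 6           # 6xV100 per node, from the README
PIPELINE_PARALLEL = 12      # from the README
MODEL_PARALLEL = 2          # from the README
GRAD_ACCUM_STEPS = 8        # from the README
TOTAL_TOKENS = 800 * 10**9  # 800B tokens, from the README
GLOBAL_BATCH_TOKENS = 4 * 2**20  # assumption: "4M" read as 4 * 2^20


def training_layout():
    """Derive the implied layout; assumes one job spanning every GPU."""
    total_gpus = NODES * GPUS_PER_NODE
    # One model replica occupies pipeline stages x model-parallel shards.
    gpus_per_replica = PIPELINE_PARALLEL * MODEL_PARALLEL
    # Assumption: the remaining factor is pure data parallelism.
    data_parallel, leftover_gpus = divmod(total_gpus, gpus_per_replica)
    # Tokens each replica processes per accumulation step.
    tokens_per_micro_batch = GLOBAL_BATCH_TOKENS // (
        data_parallel * GRAD_ACCUM_STEPS
    )
    # Whole optimizer steps needed to consume the token budget.
    optimizer_steps = TOTAL_TOKENS // GLOBAL_BATCH_TOKENS
    return {
        "total_gpus": total_gpus,
        "gpus_per_replica": gpus_per_replica,
        "data_parallel": data_parallel,
        "leftover_gpus": leftover_gpus,
        "tokens_per_micro_batch": tokens_per_micro_batch,
        "optimizer_steps": optimizer_steps,
    }


if __name__ == "__main__":
    for key, value in training_layout().items():
        print(f"{key}: {value}")
```

Under these assumptions the figures are mutually consistent: 3072 GPUs divide evenly into 128 replicas of 24 GPUs each, and each replica would see 4096 tokens per accumulation step. If the actual job used fewer nodes or a different reading of "4M", the derived numbers change accordingly.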