Commit dfe6bd9 (verified) by craffel (HF Staff)
1 parent: 18f8aaa

Fix syntax in README

Files changed (1): README.md (+2 -2)
README.md CHANGED

@@ -1,4 +1,4 @@
-# ---
+---
 license: apache-2.0
 datasets:
 - common-pile/comma_v0.1_training_dataset
@@ -26,7 +26,7 @@ It performs comparably to budget-matched models (7 billion parameters, 1 trillio
 Comma v0.1 is a decoder-only transformer that uses the same architecture as Llama 3.
 Training was done in two stages: first on 965 billion tokens with a cosine learning rate schedule, and second a "cool-down" training phase on 35 billion tokens from high-quality sources.
 The final model is the average of 10 checkpoints during this cool-down phase.
-Training was performed using [https://github.com/facebookresearch/lingua/](lingua) on 64 Nvidia H100 GPUs.
+Training was performed using [lingua](https://github.com/facebookresearch/lingua/) on 64 Nvidia H100 GPUs.
 Hyperparameters can be found in our [lingua config file](https://huggingface.co/common-pile/comma-v0.1-checkpoints/blob/main/config.yaml).
 
 ## Limitations
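
For context, the edited README passage notes that the final Comma v0.1 model is the average of 10 checkpoints saved during the cool-down phase. Below is a minimal sketch of that kind of checkpoint averaging for `.safetensors` files; the function and file names are hypothetical, and the actual averaging for Comma v0.1 was done as part of the lingua-based training setup rather than with this script.

```python
# Minimal sketch: element-wise mean of the parameters of several
# .safetensors checkpoints. Paths and names are hypothetical; this is
# not the script used for Comma v0.1.
from safetensors.torch import load_file, save_file

def average_checkpoints(paths, out_path):
    """Write a checkpoint whose tensors are the mean over `paths`."""
    avg = None
    for path in paths:
        state = load_file(path)  # dict: parameter name -> tensor
        if avg is None:
            # Accumulate in float32 to avoid precision loss when summing.
            avg = {k: v.clone().float() for k, v in state.items()}
        else:
            for k, v in state.items():
                avg[k] += v.float()
    avg = {k: v / len(paths) for k, v in avg.items()}
    save_file(avg, out_path)

# Hypothetical usage over the 10 cool-down checkpoints:
# average_checkpoints([f"ckpt_{i}.safetensors" for i in range(10)],
#                     "model.safetensors")
```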