Commit ae814bf
1 Parent(s): 11a7c1b
Add exact param counts
README.md CHANGED
@@ -265,7 +265,9 @@ Please see [the BLOOM training README](https://github.com/bigscience-workshop/bi
 
 * ALiBI positional encodings (see [paper](https://arxiv.org/pdf/2108.12409.pdf)), with GeLU activation functions
 
-* 176
+* 176,247,271,424 parameters:
+
+* 3,596,615,680 embedding parameters
 
 * 70 layers, 112 attention heads
 
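The exact counts added in this commit can be cross-checked with a little arithmetic. The following is a minimal sketch, not the authors' own accounting. It assumes a hidden size of 14,336 and a vocabulary of 250,880 tokens (neither value appears in this diff), a standard GPT-style block with bias terms, and one extra layer norm after the embeddings plus one final layer norm. The helper names are hypothetical. Only the layer count and the two totals come from the diff.

```python
# Sanity check of the parameter counts quoted in the diff.
# ASSUMPTIONS (not stated in the diff): HIDDEN and VOCAB below, and the
# per-block layout (fused QKV, output projection, 4x MLP, biases everywhere).
HIDDEN = 14_336   # assumed hidden size
VOCAB = 250_880   # assumed (padded) vocabulary size
LAYERS = 70       # from the diff: "70 layers"


def embedding_params(vocab: int, hidden: int) -> int:
    """Word-embedding matrix only; ALiBi adds no learned positional parameters."""
    return vocab * hidden


def layer_norm_params(hidden: int) -> int:
    """One layer norm: a weight vector and a bias vector."""
    return 2 * hidden


def block_params(hidden: int) -> int:
    """Parameters of one transformer block under the assumed layout."""
    qkv = 3 * hidden * hidden + 3 * hidden        # fused query/key/value
    attn_out = hidden * hidden + hidden           # attention output projection
    mlp_up = 4 * hidden * hidden + 4 * hidden     # hidden -> 4 * hidden
    mlp_down = 4 * hidden * hidden + hidden       # 4 * hidden -> hidden
    norms = 2 * layer_norm_params(hidden)         # pre-attention and pre-MLP
    return qkv + attn_out + mlp_up + mlp_down + norms


def total_params(vocab: int, hidden: int, layers: int) -> int:
    """Embeddings + all blocks + embedding layer norm + final layer norm."""
    extra_norms = 2 * layer_norm_params(hidden)
    return (
        embedding_params(vocab, hidden)
        + layers * block_params(hidden)
        + extra_norms
    )


print(f"{embedding_params(VOCAB, HIDDEN):,}")      # embedding parameters
print(f"{total_params(VOCAB, HIDDEN, LAYERS):,}")  # all parameters
```

Under these assumptions the two printed values match the `3,596,615,680` and `176,247,271,424` figures in the added lines. A different vocabulary padding or bias layout would change the totals.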