Narsil HF Staff committed on
Commit
29d738c
·
verified ·
1 Parent(s): 7c00466

Upload 3 files from dgpt2 - topology.json
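
Each entry in the added topology.json records a tensor's full `shape` together with a list of `chunks`, where every chunk has its own `offsets`, `shape`, and `filename_index`. The sketch below is one way to sanity-check such an entry. It assumes that `offsets` are start indices into the full tensor, which the values in this file suggest but the commit does not state. The function name `chunks_tile_tensor` is hypothetical and is not part of any library.

```python
from itertools import combinations
from math import prod


def chunks_tile_tensor(entry):
    """Return True if the chunks cover the full tensor exactly once.

    Assumption: each chunk's "offsets" are start indices into the full tensor.
    """
    full = entry["shape"]
    boxes = []
    for chunk in entry["chunks"]:
        lo, size = chunk["offsets"], chunk["shape"]
        # Every chunk must have the same rank as the full tensor.
        if len(lo) != len(full) or len(size) != len(full):
            return False
        hi = [o + s for o, s in zip(lo, size)]
        # Every chunk must stay inside the full tensor's bounds.
        if any(l < 0 or h > f for l, h, f in zip(lo, hi, full)):
            return False
        boxes.append((lo, hi))
    # Two boxes overlap only if their index ranges overlap on every axis.
    for (lo_a, hi_a), (lo_b, hi_b) in combinations(boxes, 2):
        if all(la < hb and lb < ha
               for la, ha, lb, hb in zip(lo_a, hi_a, lo_b, hi_b)):
            return False
    # In-bounds, pairwise-disjoint boxes tile the tensor when volumes match.
    covered = sum(prod(h - l for l, h in zip(lo, hi)) for lo, hi in boxes)
    return covered == prod(full)


# Entry copied from this file: a [768, 3072] tensor split by columns across two files.
example = {
    "type": "Distributed",
    "shape": [768, 3072],
    "dtype": "F32",
    "chunks": [
        {"offsets": [0, 0], "shape": [768, 1536], "filename_index": 0},
        {"offsets": [0, 1536], "shape": [768, 1536], "filename_index": 1},
    ],
}
print(chunks_tile_tensor(example))
```

To check the whole file, the same function could be applied to every value under the top-level `"tensors"` key after loading it with `json.load`.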

Files changed (1)
  1. topology.json +4759 -0
topology.json ADDED
@@ -0,0 +1,4759 @@
1
+ {
2
+ "tensors": {
3
+ "h.3.ln_2.bias": {
4
+ "type": "Distributed",
5
+ "shape": [
6
+ 768
7
+ ],
8
+ "dtype": "F32",
9
+ "chunks": [
10
+ {
11
+ "offsets": [
12
+ 0
13
+ ],
14
+ "shape": [
15
+ 384
16
+ ],
17
+ "filename_index": 0
18
+ },
19
+ {
20
+ "offsets": [
21
+ 384
22
+ ],
23
+ "shape": [
24
+ 384
25
+ ],
26
+ "filename_index": 1
27
+ }
28
+ ]
29
+ },
30
+ "h.6.mlp.c_fc.weight": {
31
+ "type": "Distributed",
32
+ "shape": [
33
+ 768,
34
+ 3072
35
+ ],
36
+ "dtype": "F32",
37
+ "chunks": [
38
+ {
39
+ "offsets": [
40
+ 0,
41
+ 0
42
+ ],
43
+ "shape": [
44
+ 768,
45
+ 1536
46
+ ],
47
+ "filename_index": 0
48
+ },
49
+ {
50
+ "offsets": [
51
+ 0,
52
+ 1536
53
+ ],
54
+ "shape": [
55
+ 768,
56
+ 1536
57
+ ],
58
+ "filename_index": 1
59
+ }
60
+ ]
61
+ },
62
+ "h.3.attn.c_proj.weight": {
63
+ "type": "Distributed",
64
+ "shape": [
65
+ 768,
66
+ 768
67
+ ],
68
+ "dtype": "F32",
69
+ "chunks": [
70
+ {
71
+ "offsets": [
72
+ 0,
73
+ 0
74
+ ],
75
+ "shape": [
76
+ 384,
77
+ 768
78
+ ],
79
+ "filename_index": 0
80
+ },
81
+ {
82
+ "offsets": [
83
+ 384,
84
+ 0
85
+ ],
86
+ "shape": [
87
+ 384,
88
+ 768
89
+ ],
90
+ "filename_index": 1
91
+ }
92
+ ]
93
+ },
94
+ "h.4.mlp.c_proj.weight": {
95
+ "type": "Distributed",
96
+ "shape": [
97
+ 3072,
98
+ 768
99
+ ],
100
+ "dtype": "F32",
101
+ "chunks": [
102
+ {
103
+ "offsets": [
104
+ 0,
105
+ 0
106
+ ],
107
+ "shape": [
108
+ 1536,
109
+ 768
110
+ ],
111
+ "filename_index": 0
112
+ },
113
+ {
114
+ "offsets": [
115
+ 1536,
116
+ 0
117
+ ],
118
+ "shape": [
119
+ 1536,
120
+ 768
121
+ ],
122
+ "filename_index": 1
123
+ }
124
+ ]
125
+ },
126
+ "h.2.attn.c_attn.bias": {
127
+ "type": "Distributed",
128
+ "shape": [
129
+ 2304
130
+ ],
131
+ "dtype": "F32",
132
+ "chunks": [
133
+ {
134
+ "offsets": [
135
+ 0
136
+ ],
137
+ "shape": [
138
+ 1152
139
+ ],
140
+ "filename_index": 0
141
+ },
142
+ {
143
+ "offsets": [
144
+ 1152
145
+ ],
146
+ "shape": [
147
+ 1152
148
+ ],
149
+ "filename_index": 1
150
+ }
151
+ ]
152
+ },
153
+ "h.10.attn.c_attn.weight": {
154
+ "type": "Distributed",
155
+ "shape": [
156
+ 768,
157
+ 2304
158
+ ],
159
+ "dtype": "F32",
160
+ "chunks": [
161
+ {
162
+ "offsets": [
163
+ 0,
164
+ 0
165
+ ],
166
+ "shape": [
167
+ 768,
168
+ 1152
169
+ ],
170
+ "filename_index": 0
171
+ },
172
+ {
173
+ "offsets": [
174
+ 0,
175
+ 1152
176
+ ],
177
+ "shape": [
178
+ 768,
179
+ 1152
180
+ ],
181
+ "filename_index": 1
182
+ }
183
+ ]
184
+ },
185
+ "h.5.ln_2.bias": {
186
+ "type": "Distributed",
187
+ "shape": [
188
+ 768
189
+ ],
190
+ "dtype": "F32",
191
+ "chunks": [
192
+ {
193
+ "offsets": [
194
+ 0
195
+ ],
196
+ "shape": [
197
+ 384
198
+ ],
199
+ "filename_index": 0
200
+ },
201
+ {
202
+ "offsets": [
203
+ 384
204
+ ],
205
+ "shape": [
206
+ 384
207
+ ],
208
+ "filename_index": 1
209
+ }
210
+ ]
211
+ },
212
+ "ln_f.weight": {
213
+ "type": "Distributed",
214
+ "shape": [
215
+ 768
216
+ ],
217
+ "dtype": "F32",
218
+ "chunks": [
219
+ {
220
+ "offsets": [
221
+ 0
222
+ ],
223
+ "shape": [
224
+ 384
225
+ ],
226
+ "filename_index": 0
227
+ },
228
+ {
229
+ "offsets": [
230
+ 384
231
+ ],
232
+ "shape": [
233
+ 384
234
+ ],
235
+ "filename_index": 1
236
+ }
237
+ ]
238
+ },
239
+ "h.6.ln_1.bias": {
240
+ "type": "Distributed",
241
+ "shape": [
242
+ 768
243
+ ],
244
+ "dtype": "F32",
245
+ "chunks": [
246
+ {
247
+ "offsets": [
248
+ 0
249
+ ],
250
+ "shape": [
251
+ 384
252
+ ],
253
+ "filename_index": 0
254
+ },
255
+ {
256
+ "offsets": [
257
+ 384
258
+ ],
259
+ "shape": [
260
+ 384
261
+ ],
262
+ "filename_index": 1
263
+ }
264
+ ]
265
+ },
266
+ "h.7.attn.bias": {
267
+ "type": "Distributed",
268
+ "shape": [
269
+ 1,
270
+ 1,
271
+ 1024,
272
+ 1024
273
+ ],
274
+ "dtype": "F32",
275
+ "chunks": [
276
+ {
277
+ "offsets": [
278
+ 0,
279
+ 0,
280
+ 0,
281
+ 0
282
+ ],
283
+ "shape": [
284
+ 1,
285
+ 1,
286
+ 1024,
287
+ 512
288
+ ],
289
+ "filename_index": 0
290
+ },
291
+ {
292
+ "offsets": [
293
+ 0,
294
+ 0,
295
+ 0,
296
+ 512
297
+ ],
298
+ "shape": [
299
+ 1,
300
+ 1,
301
+ 1024,
302
+ 512
303
+ ],
304
+ "filename_index": 1
305
+ }
306
+ ]
307
+ },
308
+ "h.3.mlp.c_fc.bias": {
309
+ "type": "Distributed",
310
+ "shape": [
311
+ 3072
312
+ ],
313
+ "dtype": "F32",
314
+ "chunks": [
315
+ {
316
+ "offsets": [
317
+ 0
318
+ ],
319
+ "shape": [
320
+ 1536
321
+ ],
322
+ "filename_index": 0
323
+ },
324
+ {
325
+ "offsets": [
326
+ 1536
327
+ ],
328
+ "shape": [
329
+ 1536
330
+ ],
331
+ "filename_index": 1
332
+ }
333
+ ]
334
+ },
335
+ "h.1.mlp.c_proj.bias": {
336
+ "type": "Distributed",
337
+ "shape": [
338
+ 768
339
+ ],
340
+ "dtype": "F32",
341
+ "chunks": [
342
+ {
343
+ "offsets": [
344
+ 0
345
+ ],
346
+ "shape": [
347
+ 384
348
+ ],
349
+ "filename_index": 0
350
+ },
351
+ {
352
+ "offsets": [
353
+ 384
354
+ ],
355
+ "shape": [
356
+ 384
357
+ ],
358
+ "filename_index": 1
359
+ }
360
+ ]
361
+ },
362
+ "h.8.ln_1.weight": {
363
+ "type": "Distributed",
364
+ "shape": [
365
+ 768
366
+ ],
367
+ "dtype": "F32",
368
+ "chunks": [
369
+ {
370
+ "offsets": [
371
+ 0
372
+ ],
373
+ "shape": [
374
+ 384
375
+ ],
376
+ "filename_index": 0
377
+ },
378
+ {
379
+ "offsets": [
380
+ 384
381
+ ],
382
+ "shape": [
383
+ 384
384
+ ],
385
+ "filename_index": 1
386
+ }
387
+ ]
388
+ },
389
+ "h.5.ln_2.weight": {
390
+ "type": "Distributed",
391
+ "shape": [
392
+ 768
393
+ ],
394
+ "dtype": "F32",
395
+ "chunks": [
396
+ {
397
+ "offsets": [
398
+ 0
399
+ ],
400
+ "shape": [
401
+ 384
402
+ ],
403
+ "filename_index": 0
404
+ },
405
+ {
406
+ "offsets": [
407
+ 384
408
+ ],
409
+ "shape": [
410
+ 384
411
+ ],
412
+ "filename_index": 1
413
+ }
414
+ ]
415
+ },
416
+ "h.2.mlp.c_fc.weight": {
417
+ "type": "Distributed",
418
+ "shape": [
419
+ 768,
420
+ 3072
421
+ ],
422
+ "dtype": "F32",
423
+ "chunks": [
424
+ {
425
+ "offsets": [
426
+ 0,
427
+ 0
428
+ ],
429
+ "shape": [
430
+ 768,
431
+ 1536
432
+ ],
433
+ "filename_index": 0
434
+ },
435
+ {
436
+ "offsets": [
437
+ 0,
438
+ 1536
439
+ ],
440
+ "shape": [
441
+ 768,
442
+ 1536
443
+ ],
444
+ "filename_index": 1
445
+ }
446
+ ]
447
+ },
448
+ "h.2.ln_1.weight": {
449
+ "type": "Distributed",
450
+ "shape": [
451
+ 768
452
+ ],
453
+ "dtype": "F32",
454
+ "chunks": [
455
+ {
456
+ "offsets": [
457
+ 0
458
+ ],
459
+ "shape": [
460
+ 384
461
+ ],
462
+ "filename_index": 0
463
+ },
464
+ {
465
+ "offsets": [
466
+ 384
467
+ ],
468
+ "shape": [
469
+ 384
470
+ ],
471
+ "filename_index": 1
472
+ }
473
+ ]
474
+ },
475
+ "h.10.ln_1.weight": {
476
+ "type": "Distributed",
477
+ "shape": [
478
+ 768
479
+ ],
480
+ "dtype": "F32",
481
+ "chunks": [
482
+ {
483
+ "offsets": [
484
+ 0
485
+ ],
486
+ "shape": [
487
+ 384
488
+ ],
489
+ "filename_index": 0
490
+ },
491
+ {
492
+ "offsets": [
493
+ 384
494
+ ],
495
+ "shape": [
496
+ 384
497
+ ],
498
+ "filename_index": 1
499
+ }
500
+ ]
501
+ },
502
+ "h.10.attn.c_attn.bias": {
503
+ "type": "Distributed",
504
+ "shape": [
505
+ 2304
506
+ ],
507
+ "dtype": "F32",
508
+ "chunks": [
509
+ {
510
+ "offsets": [
511
+ 0
512
+ ],
513
+ "shape": [
514
+ 1152
515
+ ],
516
+ "filename_index": 0
517
+ },
518
+ {
519
+ "offsets": [
520
+ 1152
521
+ ],
522
+ "shape": [
523
+ 1152
524
+ ],
525
+ "filename_index": 1
526
+ }
527
+ ]
528
+ },
529
+ "h.8.attn.c_attn.bias": {
530
+ "type": "Distributed",
531
+ "shape": [
532
+ 2304
533
+ ],
534
+ "dtype": "F32",
535
+ "chunks": [
536
+ {
537
+ "offsets": [
538
+ 0
539
+ ],
540
+ "shape": [
541
+ 1152
542
+ ],
543
+ "filename_index": 0
544
+ },
545
+ {
546
+ "offsets": [
547
+ 1152
548
+ ],
549
+ "shape": [
550
+ 1152
551
+ ],
552
+ "filename_index": 1
553
+ }
554
+ ]
555
+ },
556
+ "h.11.attn.c_proj.bias": {
557
+ "type": "Distributed",
558
+ "shape": [
559
+ 768
560
+ ],
561
+ "dtype": "F32",
562
+ "chunks": [
563
+ {
564
+ "offsets": [
565
+ 0
566
+ ],
567
+ "shape": [
568
+ 384
569
+ ],
570
+ "filename_index": 0
571
+ },
572
+ {
573
+ "offsets": [
574
+ 384
575
+ ],
576
+ "shape": [
577
+ 384
578
+ ],
579
+ "filename_index": 1
580
+ }
581
+ ]
582
+ },
583
+ "h.1.attn.c_attn.bias": {
584
+ "type": "Distributed",
585
+ "shape": [
586
+ 2304
587
+ ],
588
+ "dtype": "F32",
589
+ "chunks": [
590
+ {
591
+ "offsets": [
592
+ 0
593
+ ],
594
+ "shape": [
595
+ 1152
596
+ ],
597
+ "filename_index": 0
598
+ },
599
+ {
600
+ "offsets": [
601
+ 1152
602
+ ],
603
+ "shape": [
604
+ 1152
605
+ ],
606
+ "filename_index": 1
607
+ }
608
+ ]
609
+ },
610
+ "h.9.ln_2.weight": {
611
+ "type": "Distributed",
612
+ "shape": [
613
+ 768
614
+ ],
615
+ "dtype": "F32",
616
+ "chunks": [
617
+ {
618
+ "offsets": [
619
+ 0
620
+ ],
621
+ "shape": [
622
+ 384
623
+ ],
624
+ "filename_index": 0
625
+ },
626
+ {
627
+ "offsets": [
628
+ 384
629
+ ],
630
+ "shape": [
631
+ 384
632
+ ],
633
+ "filename_index": 1
634
+ }
635
+ ]
636
+ },
637
+ "h.11.ln_1.weight": {
638
+ "type": "Distributed",
639
+ "shape": [
640
+ 768
641
+ ],
642
+ "dtype": "F32",
643
+ "chunks": [
644
+ {
645
+ "offsets": [
646
+ 0
647
+ ],
648
+ "shape": [
649
+ 384
650
+ ],
651
+ "filename_index": 0
652
+ },
653
+ {
654
+ "offsets": [
655
+ 384
656
+ ],
657
+ "shape": [
658
+ 384
659
+ ],
660
+ "filename_index": 1
661
+ }
662
+ ]
663
+ },
664
+ "h.4.ln_1.bias": {
665
+ "type": "Distributed",
666
+ "shape": [
667
+ 768
668
+ ],
669
+ "dtype": "F32",
670
+ "chunks": [
671
+ {
672
+ "offsets": [
673
+ 0
674
+ ],
675
+ "shape": [
676
+ 384
677
+ ],
678
+ "filename_index": 0
679
+ },
680
+ {
681
+ "offsets": [
682
+ 384
683
+ ],
684
+ "shape": [
685
+ 384
686
+ ],
687
+ "filename_index": 1
688
+ }
689
+ ]
690
+ },
691
+ "h.9.attn.c_attn.weight": {
692
+ "type": "Distributed",
693
+ "shape": [
694
+ 768,
695
+ 2304
696
+ ],
697
+ "dtype": "F32",
698
+ "chunks": [
699
+ {
700
+ "offsets": [
701
+ 0,
702
+ 0
703
+ ],
704
+ "shape": [
705
+ 768,
706
+ 1152
707
+ ],
708
+ "filename_index": 0
709
+ },
710
+ {
711
+ "offsets": [
712
+ 0,
713
+ 1152
714
+ ],
715
+ "shape": [
716
+ 768,
717
+ 1152
718
+ ],
719
+ "filename_index": 1
720
+ }
721
+ ]
722
+ },
723
+ "h.1.attn.c_attn.weight": {
724
+ "type": "Distributed",
725
+ "shape": [
726
+ 768,
727
+ 2304
728
+ ],
729
+ "dtype": "F32",
730
+ "chunks": [
731
+ {
732
+ "offsets": [
733
+ 0,
734
+ 0
735
+ ],
736
+ "shape": [
737
+ 768,
738
+ 1152
739
+ ],
740
+ "filename_index": 0
741
+ },
742
+ {
743
+ "offsets": [
744
+ 0,
745
+ 1152
746
+ ],
747
+ "shape": [
748
+ 768,
749
+ 1152
750
+ ],
751
+ "filename_index": 1
752
+ }
753
+ ]
754
+ },
755
+ "h.9.mlp.c_proj.bias": {
756
+ "type": "Distributed",
757
+ "shape": [
758
+ 768
759
+ ],
760
+ "dtype": "F32",
761
+ "chunks": [
762
+ {
763
+ "offsets": [
764
+ 0
765
+ ],
766
+ "shape": [
767
+ 384
768
+ ],
769
+ "filename_index": 0
770
+ },
771
+ {
772
+ "offsets": [
773
+ 384
774
+ ],
775
+ "shape": [
776
+ 384
777
+ ],
778
+ "filename_index": 1
779
+ }
780
+ ]
781
+ },
782
+ "h.1.attn.c_proj.weight": {
783
+ "type": "Distributed",
784
+ "shape": [
785
+ 768,
786
+ 768
787
+ ],
788
+ "dtype": "F32",
789
+ "chunks": [
790
+ {
791
+ "offsets": [
792
+ 0,
793
+ 0
794
+ ],
795
+ "shape": [
796
+ 384,
797
+ 768
798
+ ],
799
+ "filename_index": 0
800
+ },
801
+ {
802
+ "offsets": [
803
+ 384,
804
+ 0
805
+ ],
806
+ "shape": [
807
+ 384,
808
+ 768
809
+ ],
810
+ "filename_index": 1
811
+ }
812
+ ]
813
+ },
814
+ "h.9.attn.c_proj.weight": {
815
+ "type": "Distributed",
816
+ "shape": [
817
+ 768,
818
+ 768
819
+ ],
820
+ "dtype": "F32",
821
+ "chunks": [
822
+ {
823
+ "offsets": [
824
+ 0,
825
+ 0
826
+ ],
827
+ "shape": [
828
+ 384,
829
+ 768
830
+ ],
831
+ "filename_index": 0
832
+ },
833
+ {
834
+ "offsets": [
835
+ 384,
836
+ 0
837
+ ],
838
+ "shape": [
839
+ 384,
840
+ 768
841
+ ],
842
+ "filename_index": 1
843
+ }
844
+ ]
845
+ },
846
+ "h.0.attn.c_proj.weight": {
847
+ "type": "Distributed",
848
+ "shape": [
849
+ 768,
850
+ 768
851
+ ],
852
+ "dtype": "F32",
853
+ "chunks": [
854
+ {
855
+ "offsets": [
856
+ 0,
857
+ 0
858
+ ],
859
+ "shape": [
860
+ 384,
861
+ 768
862
+ ],
863
+ "filename_index": 0
864
+ },
865
+ {
866
+ "offsets": [
867
+ 384,
868
+ 0
869
+ ],
870
+ "shape": [
871
+ 384,
872
+ 768
873
+ ],
874
+ "filename_index": 1
875
+ }
876
+ ]
877
+ },
878
+ "h.10.ln_2.bias": {
879
+ "type": "Distributed",
880
+ "shape": [
881
+ 768
882
+ ],
883
+ "dtype": "F32",
884
+ "chunks": [
885
+ {
886
+ "offsets": [
887
+ 0
888
+ ],
889
+ "shape": [
890
+ 384
891
+ ],
892
+ "filename_index": 0
893
+ },
894
+ {
895
+ "offsets": [
896
+ 384
897
+ ],
898
+ "shape": [
899
+ 384
900
+ ],
901
+ "filename_index": 1
902
+ }
903
+ ]
904
+ },
905
+ "h.1.mlp.c_proj.weight": {
906
+ "type": "Distributed",
907
+ "shape": [
908
+ 3072,
909
+ 768
910
+ ],
911
+ "dtype": "F32",
912
+ "chunks": [
913
+ {
914
+ "offsets": [
915
+ 0,
916
+ 0
917
+ ],
918
+ "shape": [
919
+ 1536,
920
+ 768
921
+ ],
922
+ "filename_index": 0
923
+ },
924
+ {
925
+ "offsets": [
926
+ 1536,
927
+ 0
928
+ ],
929
+ "shape": [
930
+ 1536,
931
+ 768
932
+ ],
933
+ "filename_index": 1
934
+ }
935
+ ]
936
+ },
937
+ "h.9.ln_2.bias": {
938
+ "type": "Distributed",
939
+ "shape": [
940
+ 768
941
+ ],
942
+ "dtype": "F32",
943
+ "chunks": [
944
+ {
945
+ "offsets": [
946
+ 0
947
+ ],
948
+ "shape": [
949
+ 384
950
+ ],
951
+ "filename_index": 0
952
+ },
953
+ {
954
+ "offsets": [
955
+ 384
956
+ ],
957
+ "shape": [
958
+ 384
959
+ ],
960
+ "filename_index": 1
961
+ }
962
+ ]
963
+ },
964
+ "h.5.attn.c_proj.bias": {
965
+ "type": "Distributed",
966
+ "shape": [
967
+ 768
968
+ ],
969
+ "dtype": "F32",
970
+ "chunks": [
971
+ {
972
+ "offsets": [
973
+ 0
974
+ ],
975
+ "shape": [
976
+ 384
977
+ ],
978
+ "filename_index": 0
979
+ },
980
+ {
981
+ "offsets": [
982
+ 384
983
+ ],
984
+ "shape": [
985
+ 384
986
+ ],
987
+ "filename_index": 1
988
+ }
989
+ ]
990
+ },
991
+ "h.0.ln_2.bias": {
992
+ "type": "Distributed",
993
+ "shape": [
994
+ 768
995
+ ],
996
+ "dtype": "F32",
997
+ "chunks": [
998
+ {
999
+ "offsets": [
1000
+ 0
1001
+ ],
1002
+ "shape": [
1003
+ 384
1004
+ ],
1005
+ "filename_index": 0
1006
+ },
1007
+ {
1008
+ "offsets": [
1009
+ 384
1010
+ ],
1011
+ "shape": [
1012
+ 384
1013
+ ],
1014
+ "filename_index": 1
1015
+ }
1016
+ ]
1017
+ },
1018
+ "h.2.attn.bias": {
1019
+ "type": "Distributed",
1020
+ "shape": [
1021
+ 1,
1022
+ 1,
1023
+ 1024,
1024
+ 1024
1025
+ ],
1026
+ "dtype": "F32",
1027
+ "chunks": [
1028
+ {
1029
+ "offsets": [
1030
+ 0,
1031
+ 0,
1032
+ 0,
1033
+ 0
1034
+ ],
1035
+ "shape": [
1036
+ 1,
1037
+ 1,
1038
+ 1024,
1039
+ 512
1040
+ ],
1041
+ "filename_index": 0
1042
+ },
1043
+ {
1044
+ "offsets": [
1045
+ 0,
1046
+ 0,
1047
+ 0,
1048
+ 512
1049
+ ],
1050
+ "shape": [
1051
+ 1,
1052
+ 1,
1053
+ 1024,
1054
+ 512
1055
+ ],
1056
+ "filename_index": 1
1057
+ }
1058
+ ]
1059
+ },
1060
+ "h.1.mlp.c_fc.bias": {
1061
+ "type": "Distributed",
1062
+ "shape": [
1063
+ 3072
1064
+ ],
1065
+ "dtype": "F32",
1066
+ "chunks": [
1067
+ {
1068
+ "offsets": [
1069
+ 0
1070
+ ],
1071
+ "shape": [
1072
+ 1536
1073
+ ],
1074
+ "filename_index": 0
1075
+ },
1076
+ {
1077
+ "offsets": [
1078
+ 1536
1079
+ ],
1080
+ "shape": [
1081
+ 1536
1082
+ ],
1083
+ "filename_index": 1
1084
+ }
1085
+ ]
1086
+ },
1087
+ "h.2.ln_2.bias": {
1088
+ "type": "Distributed",
1089
+ "shape": [
1090
+ 768
1091
+ ],
1092
+ "dtype": "F32",
1093
+ "chunks": [
1094
+ {
1095
+ "offsets": [
1096
+ 0
1097
+ ],
1098
+ "shape": [
1099
+ 384
1100
+ ],
1101
+ "filename_index": 0
1102
+ },
1103
+ {
1104
+ "offsets": [
1105
+ 384
1106
+ ],
1107
+ "shape": [
1108
+ 384
1109
+ ],
1110
+ "filename_index": 1
1111
+ }
1112
+ ]
1113
+ },
1114
+ "h.7.mlp.c_proj.bias": {
1115
+ "type": "Distributed",
1116
+ "shape": [
1117
+ 768
1118
+ ],
1119
+ "dtype": "F32",
1120
+ "chunks": [
1121
+ {
1122
+ "offsets": [
1123
+ 0
1124
+ ],
1125
+ "shape": [
1126
+ 384
1127
+ ],
1128
+ "filename_index": 0
1129
+ },
1130
+ {
1131
+ "offsets": [
1132
+ 384
1133
+ ],
1134
+ "shape": [
1135
+ 384
1136
+ ],
1137
+ "filename_index": 1
1138
+ }
1139
+ ]
1140
+ },
1141
+ "h.9.mlp.c_fc.weight": {
1142
+ "type": "Distributed",
1143
+ "shape": [
1144
+ 768,
1145
+ 3072
1146
+ ],
1147
+ "dtype": "F32",
1148
+ "chunks": [
1149
+ {
1150
+ "offsets": [
1151
+ 0,
1152
+ 0
1153
+ ],
1154
+ "shape": [
1155
+ 768,
1156
+ 1536
1157
+ ],
1158
+ "filename_index": 0
1159
+ },
1160
+ {
1161
+ "offsets": [
1162
+ 0,
1163
+ 1536
1164
+ ],
1165
+ "shape": [
1166
+ 768,
1167
+ 1536
1168
+ ],
1169
+ "filename_index": 1
1170
+ }
1171
+ ]
1172
+ },
1173
+ "h.1.ln_1.bias": {
1174
+ "type": "Distributed",
1175
+ "shape": [
1176
+ 768
1177
+ ],
1178
+ "dtype": "F32",
1179
+ "chunks": [
1180
+ {
1181
+ "offsets": [
1182
+ 0
1183
+ ],
1184
+ "shape": [
1185
+ 384
1186
+ ],
1187
+ "filename_index": 0
1188
+ },
1189
+ {
1190
+ "offsets": [
1191
+ 384
1192
+ ],
1193
+ "shape": [
1194
+ 384
1195
+ ],
1196
+ "filename_index": 1
1197
+ }
1198
+ ]
1199
+ },
1200
+ "h.8.attn.c_proj.weight": {
1201
+ "type": "Distributed",
1202
+ "shape": [
1203
+ 768,
1204
+ 768
1205
+ ],
1206
+ "dtype": "F32",
1207
+ "chunks": [
1208
+ {
1209
+ "offsets": [
1210
+ 0,
1211
+ 0
1212
+ ],
1213
+ "shape": [
1214
+ 384,
1215
+ 768
1216
+ ],
1217
+ "filename_index": 0
1218
+ },
1219
+ {
1220
+ "offsets": [
1221
+ 384,
1222
+ 0
1223
+ ],
1224
+ "shape": [
1225
+ 384,
1226
+ 768
1227
+ ],
1228
+ "filename_index": 1
1229
+ }
1230
+ ]
1231
+ },
1232
+ "h.9.attn.c_attn.bias": {
1233
+ "type": "Distributed",
1234
+ "shape": [
1235
+ 2304
1236
+ ],
1237
+ "dtype": "F32",
1238
+ "chunks": [
1239
+ {
1240
+ "offsets": [
1241
+ 0
1242
+ ],
1243
+ "shape": [
1244
+ 1152
1245
+ ],
1246
+ "filename_index": 0
1247
+ },
1248
+ {
1249
+ "offsets": [
1250
+ 1152
1251
+ ],
1252
+ "shape": [
1253
+ 1152
1254
+ ],
1255
+ "filename_index": 1
1256
+ }
1257
+ ]
1258
+ },
1259
+ "h.10.attn.c_proj.bias": {
1260
+ "type": "Distributed",
1261
+ "shape": [
1262
+ 768
1263
+ ],
1264
+ "dtype": "F32",
1265
+ "chunks": [
1266
+ {
1267
+ "offsets": [
1268
+ 0
1269
+ ],
1270
+ "shape": [
1271
+ 384
1272
+ ],
1273
+ "filename_index": 0
1274
+ },
1275
+ {
1276
+ "offsets": [
1277
+ 384
1278
+ ],
1279
+ "shape": [
1280
+ 384
1281
+ ],
1282
+ "filename_index": 1
1283
+ }
1284
+ ]
1285
+ },
1286
+ "h.3.ln_1.bias": {
1287
+ "type": "Distributed",
1288
+ "shape": [
1289
+ 768
1290
+ ],
1291
+ "dtype": "F32",
1292
+ "chunks": [
1293
+ {
1294
+ "offsets": [
1295
+ 0
1296
+ ],
1297
+ "shape": [
1298
+ 384
1299
+ ],
1300
+ "filename_index": 0
1301
+ },
1302
+ {
1303
+ "offsets": [
1304
+ 384
1305
+ ],
1306
+ "shape": [
1307
+ 384
1308
+ ],
1309
+ "filename_index": 1
1310
+ }
1311
+ ]
1312
+ },
1313
+ "h.7.attn.c_attn.weight": {
1314
+ "type": "Distributed",
1315
+ "shape": [
1316
+ 768,
1317
+ 2304
1318
+ ],
1319
+ "dtype": "F32",
1320
+ "chunks": [
1321
+ {
1322
+ "offsets": [
1323
+ 0,
1324
+ 0
1325
+ ],
1326
+ "shape": [
1327
+ 768,
1328
+ 1152
1329
+ ],
1330
+ "filename_index": 0
1331
+ },
1332
+ {
1333
+ "offsets": [
1334
+ 0,
1335
+ 1152
1336
+ ],
1337
+ "shape": [
1338
+ 768,
1339
+ 1152
1340
+ ],
1341
+ "filename_index": 1
1342
+ }
1343
+ ]
1344
+ },
1345
+ "h.7.mlp.c_proj.weight": {
1346
+ "type": "Distributed",
1347
+ "shape": [
1348
+ 3072,
1349
+ 768
1350
+ ],
1351
+ "dtype": "F32",
1352
+ "chunks": [
1353
+ {
1354
+ "offsets": [
1355
+ 0,
1356
+ 0
1357
+ ],
1358
+ "shape": [
1359
+ 1536,
1360
+ 768
1361
+ ],
1362
+ "filename_index": 0
1363
+ },
1364
+ {
1365
+ "offsets": [
1366
+ 1536,
1367
+ 0
1368
+ ],
1369
+ "shape": [
1370
+ 1536,
1371
+ 768
1372
+ ],
1373
+ "filename_index": 1
1374
+ }
1375
+ ]
1376
+ },
1377
+ "h.1.attn.c_proj.bias": {
1378
+ "type": "Distributed",
1379
+ "shape": [
1380
+ 768
1381
+ ],
1382
+ "dtype": "F32",
1383
+ "chunks": [
1384
+ {
1385
+ "offsets": [
1386
+ 0
1387
+ ],
1388
+ "shape": [
1389
+ 384
1390
+ ],
1391
+ "filename_index": 0
1392
+ },
1393
+ {
1394
+ "offsets": [
1395
+ 384
1396
+ ],
1397
+ "shape": [
1398
+ 384
1399
+ ],
1400
+ "filename_index": 1
1401
+ }
1402
+ ]
1403
+ },
1404
+ "h.11.ln_1.bias": {
1405
+ "type": "Distributed",
1406
+ "shape": [
1407
+ 768
1408
+ ],
1409
+ "dtype": "F32",
1410
+ "chunks": [
1411
+ {
1412
+ "offsets": [
1413
+ 0
1414
+ ],
1415
+ "shape": [
1416
+ 384
1417
+ ],
1418
+ "filename_index": 0
1419
+ },
1420
+ {
1421
+ "offsets": [
1422
+ 384
1423
+ ],
1424
+ "shape": [
1425
+ 384
1426
+ ],
1427
+ "filename_index": 1
1428
+ }
1429
+ ]
1430
+ },
1431
+ "h.9.mlp.c_proj.weight": {
1432
+ "type": "Distributed",
1433
+ "shape": [
1434
+ 3072,
1435
+ 768
1436
+ ],
1437
+ "dtype": "F32",
1438
+ "chunks": [
1439
+ {
1440
+ "offsets": [
1441
+ 0,
1442
+ 0
1443
+ ],
1444
+ "shape": [
1445
+ 1536,
1446
+ 768
1447
+ ],
1448
+ "filename_index": 0
1449
+ },
1450
+ {
1451
+ "offsets": [
1452
+ 1536,
1453
+ 0
1454
+ ],
1455
+ "shape": [
1456
+ 1536,
1457
+ 768
1458
+ ],
1459
+ "filename_index": 1
1460
+ }
1461
+ ]
1462
+ },
1463
+ "h.0.attn.c_proj.bias": {
1464
+ "type": "Distributed",
1465
+ "shape": [
1466
+ 768
1467
+ ],
1468
+ "dtype": "F32",
1469
+ "chunks": [
1470
+ {
1471
+ "offsets": [
1472
+ 0
1473
+ ],
1474
+ "shape": [
1475
+ 384
1476
+ ],
1477
+ "filename_index": 0
1478
+ },
1479
+ {
1480
+ "offsets": [
1481
+ 384
1482
+ ],
1483
+ "shape": [
1484
+ 384
1485
+ ],
1486
+ "filename_index": 1
1487
+ }
1488
+ ]
1489
+ },
1490
+ "h.7.attn.c_attn.bias": {
1491
+ "type": "Distributed",
1492
+ "shape": [
1493
+ 2304
1494
+ ],
1495
+ "dtype": "F32",
1496
+ "chunks": [
1497
+ {
1498
+ "offsets": [
1499
+ 0
1500
+ ],
1501
+ "shape": [
1502
+ 1152
1503
+ ],
1504
+ "filename_index": 0
1505
+ },
1506
+ {
1507
+ "offsets": [
1508
+ 1152
1509
+ ],
1510
+ "shape": [
1511
+ 1152
1512
+ ],
1513
+ "filename_index": 1
1514
+ }
1515
+ ]
1516
+ },
1517
+ "h.10.ln_1.bias": {
1518
+ "type": "Distributed",
1519
+ "shape": [
1520
+ 768
1521
+ ],
1522
+ "dtype": "F32",
1523
+ "chunks": [
1524
+ {
1525
+ "offsets": [
1526
+ 0
1527
+ ],
1528
+ "shape": [
1529
+ 384
1530
+ ],
1531
+ "filename_index": 0
1532
+ },
1533
+ {
1534
+ "offsets": [
1535
+ 384
1536
+ ],
1537
+ "shape": [
1538
+ 384
1539
+ ],
1540
+ "filename_index": 1
1541
+ }
1542
+ ]
1543
+ },
1544
+ "h.8.attn.c_proj.bias": {
1545
+ "type": "Distributed",
1546
+ "shape": [
1547
+ 768
1548
+ ],
1549
+ "dtype": "F32",
1550
+ "chunks": [
1551
+ {
1552
+ "offsets": [
1553
+ 0
1554
+ ],
1555
+ "shape": [
1556
+ 384
1557
+ ],
1558
+ "filename_index": 0
1559
+ },
1560
+ {
1561
+ "offsets": [
1562
+ 384
1563
+ ],
1564
+ "shape": [
1565
+ 384
1566
+ ],
1567
+ "filename_index": 1
1568
+ }
1569
+ ]
1570
+ },
1571
+ "h.2.attn.c_attn.weight": {
1572
+ "type": "Distributed",
1573
+ "shape": [
1574
+ 768,
1575
+ 2304
1576
+ ],
1577
+ "dtype": "F32",
1578
+ "chunks": [
1579
+ {
1580
+ "offsets": [
1581
+ 0,
1582
+ 0
1583
+ ],
1584
+ "shape": [
1585
+ 768,
1586
+ 1152
1587
+ ],
1588
+ "filename_index": 0
1589
+ },
1590
+ {
1591
+ "offsets": [
1592
+ 0,
1593
+ 1152
1594
+ ],
1595
+ "shape": [
1596
+ 768,
1597
+ 1152
1598
+ ],
1599
+ "filename_index": 1
1600
+ }
1601
+ ]
1602
+ },
1603
+ "h.6.ln_2.weight": {
1604
+ "type": "Distributed",
1605
+ "shape": [
1606
+ 768
1607
+ ],
1608
+ "dtype": "F32",
1609
+ "chunks": [
1610
+ {
1611
+ "offsets": [
1612
+ 0
1613
+ ],
1614
+ "shape": [
1615
+ 384
1616
+ ],
1617
+ "filename_index": 0
1618
+ },
1619
+ {
1620
+ "offsets": [
1621
+ 384
1622
+ ],
1623
+ "shape": [
1624
+ 384
1625
+ ],
1626
+ "filename_index": 1
1627
+ }
1628
+ ]
1629
+ },
1630
+ "h.6.ln_2.bias": {
1631
+ "type": "Distributed",
1632
+ "shape": [
1633
+ 768
1634
+ ],
1635
+ "dtype": "F32",
1636
+ "chunks": [
1637
+ {
1638
+ "offsets": [
1639
+ 0
1640
+ ],
1641
+ "shape": [
1642
+ 384
1643
+ ],
1644
+ "filename_index": 0
1645
+ },
1646
+ {
1647
+ "offsets": [
1648
+ 384
1649
+ ],
1650
+ "shape": [
1651
+ 384
1652
+ ],
1653
+ "filename_index": 1
1654
+ }
1655
+ ]
1656
+ },
1657
+ "wpe.weight": {
1658
+ "type": "Distributed",
1659
+ "shape": [
1660
+ 1024,
1661
+ 768
1662
+ ],
1663
+ "dtype": "F32",
1664
+ "chunks": [
1665
+ {
1666
+ "offsets": [
1667
+ 0,
1668
+ 0
1669
+ ],
1670
+ "shape": [
1671
+ 1024,
1672
+ 384
1673
+ ],
1674
+ "filename_index": 0
1675
+ },
1676
+ {
1677
+ "offsets": [
1678
+ 0,
1679
+ 384
1680
+ ],
1681
+ "shape": [
1682
+ 1024,
1683
+ 384
1684
+ ],
1685
+ "filename_index": 1
1686
+ }
1687
+ ]
1688
+ },
1689
+ "h.3.mlp.c_fc.weight": {
1690
+ "type": "Distributed",
1691
+ "shape": [
1692
+ 768,
1693
+ 3072
1694
+ ],
1695
+ "dtype": "F32",
1696
+ "chunks": [
1697
+ {
1698
+ "offsets": [
1699
+ 0,
1700
+ 0
1701
+ ],
1702
+ "shape": [
1703
+ 768,
1704
+ 1536
1705
+ ],
1706
+ "filename_index": 0
1707
+ },
1708
+ {
1709
+ "offsets": [
1710
+ 0,
1711
+ 1536
1712
+ ],
1713
+ "shape": [
1714
+ 768,
1715
+ 1536
1716
+ ],
1717
+ "filename_index": 1
1718
+ }
1719
+ ]
1720
+ },
1721
+ "h.4.attn.c_proj.weight": {
1722
+ "type": "Distributed",
1723
+ "shape": [
1724
+ 768,
1725
+ 768
1726
+ ],
1727
+ "dtype": "F32",
1728
+ "chunks": [
1729
+ {
1730
+ "offsets": [
1731
+ 0,
1732
+ 0
1733
+ ],
1734
+ "shape": [
1735
+ 384,
1736
+ 768
1737
+ ],
1738
+ "filename_index": 0
1739
+ },
1740
+ {
1741
+ "offsets": [
1742
+ 384,
1743
+ 0
1744
+ ],
1745
+ "shape": [
1746
+ 384,
1747
+ 768
1748
+ ],
1749
+ "filename_index": 1
1750
+ }
1751
+ ]
1752
+ },
1753
+ "h.6.attn.bias": {
1754
+ "type": "Distributed",
1755
+ "shape": [
1756
+ 1,
1757
+ 1,
1758
+ 1024,
1759
+ 1024
1760
+ ],
1761
+ "dtype": "F32",
1762
+ "chunks": [
1763
+ {
1764
+ "offsets": [
1765
+ 0,
1766
+ 0,
1767
+ 0,
1768
+ 0
1769
+ ],
1770
+ "shape": [
1771
+ 1,
1772
+ 1,
1773
+ 1024,
1774
+ 512
1775
+ ],
1776
+ "filename_index": 0
1777
+ },
1778
+ {
1779
+ "offsets": [
1780
+ 0,
1781
+ 0,
1782
+ 0,
1783
+ 512
1784
+ ],
1785
+ "shape": [
1786
+ 1,
1787
+ 1,
1788
+ 1024,
1789
+ 512
1790
+ ],
1791
+ "filename_index": 1
1792
+ }
1793
+ ]
1794
+ },
1795
+ "h.7.ln_2.weight": {
1796
+ "type": "Distributed",
1797
+ "shape": [
1798
+ 768
1799
+ ],
1800
+ "dtype": "F32",
1801
+ "chunks": [
1802
+ {
1803
+ "offsets": [
1804
+ 0
1805
+ ],
1806
+ "shape": [
1807
+ 384
1808
+ ],
1809
+ "filename_index": 0
1810
+ },
1811
+ {
1812
+ "offsets": [
1813
+ 384
1814
+ ],
1815
+ "shape": [
1816
+ 384
1817
+ ],
1818
+ "filename_index": 1
1819
+ }
1820
+ ]
1821
+ },
1822
+ "h.11.mlp.c_proj.bias": {
1823
+ "type": "Distributed",
1824
+ "shape": [
1825
+ 768
1826
+ ],
1827
+ "dtype": "F32",
1828
+ "chunks": [
1829
+ {
1830
+ "offsets": [
1831
+ 0
1832
+ ],
1833
+ "shape": [
1834
+ 384
1835
+ ],
1836
+ "filename_index": 0
1837
+ },
1838
+ {
1839
+ "offsets": [
1840
+ 384
1841
+ ],
1842
+ "shape": [
1843
+ 384
1844
+ ],
1845
+ "filename_index": 1
1846
+ }
1847
+ ]
1848
+ },
1849
+ "h.10.attn.bias": {
1850
+ "type": "Distributed",
1851
+ "shape": [
1852
+ 1,
1853
+ 1,
1854
+ 1024,
1855
+ 1024
1856
+ ],
1857
+ "dtype": "F32",
1858
+ "chunks": [
1859
+ {
1860
+ "offsets": [
1861
+ 0,
1862
+ 0,
1863
+ 0,
1864
+ 0
1865
+ ],
1866
+ "shape": [
1867
+ 1,
1868
+ 1,
1869
+ 1024,
1870
+ 512
1871
+ ],
1872
+ "filename_index": 0
1873
+ },
1874
+ {
1875
+ "offsets": [
1876
+ 0,
1877
+ 0,
1878
+ 0,
1879
+ 512
1880
+ ],
1881
+ "shape": [
1882
+ 1,
1883
+ 1,
1884
+ 1024,
1885
+ 512
1886
+ ],
1887
+ "filename_index": 1
1888
+ }
1889
+ ]
1890
+ },
1891
+ "h.5.attn.bias": {
1892
+ "type": "Distributed",
1893
+ "shape": [
1894
+ 1,
1895
+ 1,
1896
+ 1024,
1897
+ 1024
1898
+ ],
1899
+ "dtype": "F32",
1900
+ "chunks": [
1901
+ {
1902
+ "offsets": [
1903
+ 0,
1904
+ 0,
1905
+ 0,
1906
+ 0
1907
+ ],
1908
+ "shape": [
1909
+ 1,
1910
+ 1,
1911
+ 1024,
1912
+ 512
1913
+ ],
1914
+ "filename_index": 0
1915
+ },
1916
+ {
1917
+ "offsets": [
1918
+ 0,
1919
+ 0,
1920
+ 0,
1921
+ 512
1922
+ ],
1923
+ "shape": [
1924
+ 1,
1925
+ 1,
1926
+ 1024,
1927
+ 512
1928
+ ],
1929
+ "filename_index": 1
1930
+ }
1931
+ ]
1932
+ },
1933
+ "ln_f.bias": {
1934
+ "type": "Distributed",
1935
+ "shape": [
1936
+ 768
1937
+ ],
1938
+ "dtype": "F32",
1939
+ "chunks": [
1940
+ {
1941
+ "offsets": [
1942
+ 0
1943
+ ],
1944
+ "shape": [
1945
+ 384
1946
+ ],
1947
+ "filename_index": 0
1948
+ },
1949
+ {
1950
+ "offsets": [
1951
+ 384
1952
+ ],
1953
+ "shape": [
1954
+ 384
1955
+ ],
1956
+ "filename_index": 1
1957
+ }
1958
+ ]
1959
+ },
1960
+ "h.3.attn.bias": {
1961
+ "type": "Distributed",
1962
+ "shape": [
1963
+ 1,
1964
+ 1,
1965
+ 1024,
1966
+ 1024
1967
+ ],
1968
+ "dtype": "F32",
1969
+ "chunks": [
1970
+ {
1971
+ "offsets": [
1972
+ 0,
1973
+ 0,
1974
+ 0,
1975
+ 0
1976
+ ],
1977
+ "shape": [
1978
+ 1,
1979
+ 1,
1980
+ 1024,
1981
+ 512
1982
+ ],
1983
+ "filename_index": 0
1984
+ },
1985
+ {
1986
+ "offsets": [
1987
+ 0,
1988
+ 0,
1989
+ 0,
1990
+ 512
1991
+ ],
1992
+ "shape": [
1993
+ 1,
1994
+ 1,
1995
+ 1024,
1996
+ 512
1997
+ ],
1998
+ "filename_index": 1
1999
+ }
2000
+ ]
2001
+ },
2002
+ "h.5.ln_1.weight": {
2003
+ "type": "Distributed",
2004
+ "shape": [
2005
+ 768
2006
+ ],
2007
+ "dtype": "F32",
2008
+ "chunks": [
2009
+ {
2010
+ "offsets": [
2011
+ 0
2012
+ ],
2013
+ "shape": [
2014
+ 384
2015
+ ],
2016
+ "filename_index": 0
2017
+ },
2018
+ {
2019
+ "offsets": [
2020
+ 384
2021
+ ],
2022
+ "shape": [
2023
+ 384
2024
+ ],
2025
+ "filename_index": 1
2026
+ }
2027
+ ]
2028
+ },
2029
+ "h.10.mlp.c_fc.bias": {
2030
+ "type": "Distributed",
2031
+ "shape": [
2032
+ 3072
2033
+ ],
2034
+ "dtype": "F32",
2035
+ "chunks": [
2036
+ {
2037
+ "offsets": [
2038
+ 0
2039
+ ],
2040
+ "shape": [
2041
+ 1536
2042
+ ],
2043
+ "filename_index": 0
2044
+ },
2045
+ {
2046
+ "offsets": [
2047
+ 1536
2048
+ ],
2049
+ "shape": [
2050
+ 1536
2051
+ ],
2052
+ "filename_index": 1
2053
+ }
2054
+ ]
2055
+ },
2056
+ "h.6.mlp.c_proj.bias": {
2057
+ "type": "Distributed",
2058
+ "shape": [
2059
+ 768
2060
+ ],
2061
+ "dtype": "F32",
2062
+ "chunks": [
2063
+ {
2064
+ "offsets": [
2065
+ 0
2066
+ ],
2067
+ "shape": [
2068
+ 384
2069
+ ],
2070
+ "filename_index": 0
2071
+ },
2072
+ {
2073
+ "offsets": [
2074
+ 384
2075
+ ],
2076
+ "shape": [
2077
+ 384
2078
+ ],
2079
+ "filename_index": 1
2080
+ }
2081
+ ]
2082
+ },
2083
+ "h.6.ln_1.weight": {
2084
+ "type": "Distributed",
2085
+ "shape": [
2086
+ 768
2087
+ ],
2088
+ "dtype": "F32",
2089
+ "chunks": [
2090
+ {
2091
+ "offsets": [
2092
+ 0
2093
+ ],
2094
+ "shape": [
2095
+ 384
2096
+ ],
2097
+ "filename_index": 0
2098
+ },
2099
+ {
2100
+ "offsets": [
2101
+ 384
2102
+ ],
2103
+ "shape": [
2104
+ 384
2105
+ ],
2106
+ "filename_index": 1
2107
+ }
2108
+ ]
2109
+ },
2110
+ "h.7.ln_1.weight": {
2111
+ "type": "Distributed",
2112
+ "shape": [
2113
+ 768
2114
+ ],
2115
+ "dtype": "F32",
2116
+ "chunks": [
2117
+ {
2118
+ "offsets": [
2119
+ 0
2120
+ ],
2121
+ "shape": [
2122
+ 384
2123
+ ],
2124
+ "filename_index": 0
2125
+ },
2126
+ {
2127
+ "offsets": [
2128
+ 384
2129
+ ],
2130
+ "shape": [
2131
+ 384
2132
+ ],
2133
+ "filename_index": 1
2134
+ }
2135
+ ]
2136
+ },
2137
+ "h.11.attn.c_attn.bias": {
2138
+ "type": "Distributed",
2139
+ "shape": [
2140
+ 2304
2141
+ ],
2142
+ "dtype": "F32",
2143
+ "chunks": [
2144
+ {
2145
+ "offsets": [
2146
+ 0
2147
+ ],
2148
+ "shape": [
2149
+ 1152
2150
+ ],
2151
+ "filename_index": 0
2152
+ },
2153
+ {
2154
+ "offsets": [
2155
+ 1152
2156
+ ],
2157
+ "shape": [
2158
+ 1152
2159
+ ],
2160
+ "filename_index": 1
2161
+ }
2162
+ ]
2163
+ },
2164
+ "h.6.attn.c_proj.bias": {
2165
+ "type": "Distributed",
2166
+ "shape": [
2167
+ 768
2168
+ ],
2169
+ "dtype": "F32",
2170
+ "chunks": [
2171
+ {
2172
+ "offsets": [
2173
+ 0
2174
+ ],
2175
+ "shape": [
2176
+ 384
2177
+ ],
2178
+ "filename_index": 0
2179
+ },
2180
+ {
2181
+ "offsets": [
2182
+ 384
2183
+ ],
2184
+ "shape": [
2185
+ 384
2186
+ ],
2187
+ "filename_index": 1
2188
+ }
2189
+ ]
2190
+ },
2191
+ "h.9.attn.c_proj.bias": {
2192
+ "type": "Distributed",
2193
+ "shape": [
2194
+ 768
2195
+ ],
2196
+ "dtype": "F32",
2197
+ "chunks": [
2198
+ {
2199
+ "offsets": [
2200
+ 0
2201
+ ],
2202
+ "shape": [
2203
+ 384
2204
+ ],
2205
+ "filename_index": 0
2206
+ },
2207
+ {
2208
+ "offsets": [
2209
+ 384
2210
+ ],
2211
+ "shape": [
2212
+ 384
2213
+ ],
2214
+ "filename_index": 1
2215
+ }
2216
+ ]
2217
+ },
2218
+ "h.4.ln_2.bias": {
2219
+ "type": "Distributed",
2220
+ "shape": [
2221
+ 768
2222
+ ],
2223
+ "dtype": "F32",
2224
+ "chunks": [
2225
+ {
2226
+ "offsets": [
2227
+ 0
2228
+ ],
2229
+ "shape": [
2230
+ 384
2231
+ ],
2232
+ "filename_index": 0
2233
+ },
2234
+ {
2235
+ "offsets": [
2236
+ 384
2237
+ ],
2238
+ "shape": [
2239
+ 384
2240
+ ],
2241
+ "filename_index": 1
2242
+ }
2243
+ ]
2244
+ },
2245
+ "h.2.mlp.c_proj.bias": {
2246
+ "type": "Distributed",
2247
+ "shape": [
2248
+ 768
2249
+ ],
2250
+ "dtype": "F32",
2251
+ "chunks": [
2252
+ {
2253
+ "offsets": [
2254
+ 0
2255
+ ],
2256
+ "shape": [
2257
+ 384
2258
+ ],
2259
+ "filename_index": 0
2260
+ },
2261
+ {
2262
+ "offsets": [
2263
+ 384
2264
+ ],
2265
+ "shape": [
2266
+ 384
2267
+ ],
2268
+ "filename_index": 1
2269
+ }
2270
+ ]
2271
+ },
2272
+ "h.4.mlp.c_fc.bias": {
2273
+ "type": "Distributed",
2274
+ "shape": [
2275
+ 3072
2276
+ ],
2277
+ "dtype": "F32",
2278
+ "chunks": [
2279
+ {
2280
+ "offsets": [
2281
+ 0
2282
+ ],
2283
+ "shape": [
2284
+ 1536
2285
+ ],
2286
+ "filename_index": 0
2287
+ },
2288
+ {
2289
+ "offsets": [
2290
+ 1536
2291
+ ],
2292
+ "shape": [
2293
+ 1536
2294
+ ],
2295
+ "filename_index": 1
2296
+ }
2297
+ ]
2298
+ },
2299
+ "h.5.attn.c_attn.bias": {
2300
+ "type": "Distributed",
2301
+ "shape": [
2302
+ 2304
2303
+ ],
2304
+ "dtype": "F32",
2305
+ "chunks": [
2306
+ {
2307
+ "offsets": [
2308
+ 0
2309
+ ],
2310
+ "shape": [
2311
+ 1152
2312
+ ],
2313
+ "filename_index": 0
2314
+ },
2315
+ {
2316
+ "offsets": [
2317
+ 1152
2318
+ ],
2319
+ "shape": [
2320
+ 1152
2321
+ ],
2322
+ "filename_index": 1
2323
+ }
2324
+ ]
2325
+ },
2326
+ "h.0.ln_1.weight": {
2327
+ "type": "Distributed",
2328
+ "shape": [
2329
+ 768
2330
+ ],
2331
+ "dtype": "F32",
2332
+ "chunks": [
2333
+ {
2334
+ "offsets": [
2335
+ 0
2336
+ ],
2337
+ "shape": [
2338
+ 384
2339
+ ],
2340
+ "filename_index": 0
2341
+ },
2342
+ {
2343
+ "offsets": [
2344
+ 384
2345
+ ],
2346
+ "shape": [
2347
+ 384
2348
+ ],
2349
+ "filename_index": 1
2350
+ }
2351
+ ]
2352
+ },
2353
+ "h.11.attn.c_proj.weight": {
2354
+ "type": "Distributed",
2355
+ "shape": [
2356
+ 768,
2357
+ 768
2358
+ ],
2359
+ "dtype": "F32",
2360
+ "chunks": [
2361
+ {
2362
+ "offsets": [
2363
+ 0,
2364
+ 0
2365
+ ],
2366
+ "shape": [
2367
+ 384,
2368
+ 768
2369
+ ],
2370
+ "filename_index": 0
2371
+ },
2372
+ {
2373
+ "offsets": [
2374
+ 384,
2375
+ 0
2376
+ ],
2377
+ "shape": [
2378
+ 384,
2379
+ 768
2380
+ ],
2381
+ "filename_index": 1
2382
+ }
2383
+ ]
2384
+ },
2385
+ "h.10.attn.c_proj.weight": {
2386
+ "type": "Distributed",
2387
+ "shape": [
2388
+ 768,
2389
+ 768
2390
+ ],
2391
+ "dtype": "F32",
2392
+ "chunks": [
2393
+ {
2394
+ "offsets": [
2395
+ 0,
2396
+ 0
2397
+ ],
2398
+ "shape": [
2399
+ 384,
2400
+ 768
2401
+ ],
2402
+ "filename_index": 0
2403
+ },
2404
+ {
2405
+ "offsets": [
2406
+ 384,
2407
+ 0
2408
+ ],
2409
+ "shape": [
2410
+ 384,
2411
+ 768
2412
+ ],
2413
+ "filename_index": 1
2414
+ }
2415
+ ]
2416
+ },
2417
+ "h.0.attn.bias": {
2418
+ "type": "Distributed",
2419
+ "shape": [
2420
+ 1,
2421
+ 1,
2422
+ 1024,
2423
+ 1024
2424
+ ],
2425
+ "dtype": "F32",
2426
+ "chunks": [
2427
+ {
2428
+ "offsets": [
2429
+ 0,
2430
+ 0,
2431
+ 0,
2432
+ 0
2433
+ ],
2434
+ "shape": [
2435
+ 1,
2436
+ 1,
2437
+ 1024,
2438
+ 512
2439
+ ],
2440
+ "filename_index": 0
2441
+ },
2442
+ {
2443
+ "offsets": [
2444
+ 0,
2445
+ 0,
2446
+ 0,
2447
+ 512
2448
+ ],
2449
+ "shape": [
2450
+ 1,
2451
+ 1,
2452
+ 1024,
2453
+ 512
2454
+ ],
2455
+ "filename_index": 1
2456
+ }
2457
+ ]
2458
+ },
2459
+ "h.4.mlp.c_proj.bias": {
2460
+ "type": "Distributed",
2461
+ "shape": [
2462
+ 768
2463
+ ],
2464
+ "dtype": "F32",
2465
+ "chunks": [
2466
+ {
2467
+ "offsets": [
2468
+ 0
2469
+ ],
2470
+ "shape": [
2471
+ 384
2472
+ ],
2473
+ "filename_index": 0
2474
+ },
2475
+ {
2476
+ "offsets": [
2477
+ 384
2478
+ ],
2479
+ "shape": [
2480
+ 384
2481
+ ],
2482
+ "filename_index": 1
2483
+ }
2484
+ ]
2485
+ },
2486
+ "h.9.ln_1.weight": {
2487
+ "type": "Distributed",
2488
+ "shape": [
2489
+ 768
2490
+ ],
2491
+ "dtype": "F32",
2492
+ "chunks": [
2493
+ {
2494
+ "offsets": [
2495
+ 0
2496
+ ],
2497
+ "shape": [
2498
+ 384
2499
+ ],
2500
+ "filename_index": 0
2501
+ },
2502
+ {
2503
+ "offsets": [
2504
+ 384
2505
+ ],
2506
+ "shape": [
2507
+ 384
2508
+ ],
2509
+ "filename_index": 1
2510
+ }
2511
+ ]
2512
+ },
2513
+ "h.2.ln_1.bias": {
2514
+ "type": "Distributed",
2515
+ "shape": [
2516
+ 768
2517
+ ],
2518
+ "dtype": "F32",
2519
+ "chunks": [
2520
+ {
2521
+ "offsets": [
2522
+ 0
2523
+ ],
2524
+ "shape": [
2525
+ 384
2526
+ ],
2527
+ "filename_index": 0
2528
+ },
2529
+ {
2530
+ "offsets": [
2531
+ 384
2532
+ ],
2533
+ "shape": [
2534
+ 384
2535
+ ],
2536
+ "filename_index": 1
2537
+ }
2538
+ ]
2539
+ },
2540
+ "h.8.mlp.c_proj.bias": {
2541
+ "type": "Distributed",
2542
+ "shape": [
2543
+ 768
2544
+ ],
2545
+ "dtype": "F32",
2546
+ "chunks": [
2547
+ {
2548
+ "offsets": [
2549
+ 0
2550
+ ],
2551
+ "shape": [
2552
+ 384
2553
+ ],
2554
+ "filename_index": 0
2555
+ },
2556
+ {
2557
+ "offsets": [
2558
+ 384
2559
+ ],
2560
+ "shape": [
2561
+ 384
2562
+ ],
2563
+ "filename_index": 1
2564
+ }
2565
+ ]
2566
+ },
2567
+ "h.3.ln_2.weight": {
2568
+ "type": "Distributed",
2569
+ "shape": [
2570
+ 768
2571
+ ],
2572
+ "dtype": "F32",
2573
+ "chunks": [
2574
+ {
2575
+ "offsets": [
2576
+ 0
2577
+ ],
2578
+ "shape": [
2579
+ 384
2580
+ ],
2581
+ "filename_index": 0
2582
+ },
2583
+ {
2584
+ "offsets": [
2585
+ 384
2586
+ ],
2587
+ "shape": [
2588
+ 384
2589
+ ],
2590
+ "filename_index": 1
2591
+ }
2592
+ ]
2593
+ },
2594
+ "h.0.mlp.c_proj.bias": {
2595
+ "type": "Distributed",
2596
+ "shape": [
2597
+ 768
2598
+ ],
2599
+ "dtype": "F32",
2600
+ "chunks": [
2601
+ {
2602
+ "offsets": [
2603
+ 0
2604
+ ],
2605
+ "shape": [
2606
+ 384
2607
+ ],
2608
+ "filename_index": 0
2609
+ },
2610
+ {
2611
+ "offsets": [
2612
+ 384
2613
+ ],
2614
+ "shape": [
2615
+ 384
2616
+ ],
2617
+ "filename_index": 1
2618
+ }
2619
+ ]
2620
+ },
2621
+ "h.7.ln_2.bias": {
2622
+ "type": "Distributed",
2623
+ "shape": [
2624
+ 768
2625
+ ],
2626
+ "dtype": "F32",
2627
+ "chunks": [
2628
+ {
2629
+ "offsets": [
2630
+ 0
2631
+ ],
2632
+ "shape": [
2633
+ 384
2634
+ ],
2635
+ "filename_index": 0
2636
+ },
2637
+ {
2638
+ "offsets": [
2639
+ 384
2640
+ ],
2641
+ "shape": [
2642
+ 384
2643
+ ],
2644
+ "filename_index": 1
2645
+ }
2646
+ ]
2647
+ },
2648
+ "h.8.ln_2.bias": {
2649
+ "type": "Distributed",
2650
+ "shape": [
2651
+ 768
2652
+ ],
2653
+ "dtype": "F32",
2654
+ "chunks": [
2655
+ {
2656
+ "offsets": [
2657
+ 0
2658
+ ],
2659
+ "shape": [
2660
+ 384
2661
+ ],
2662
+ "filename_index": 0
2663
+ },
2664
+ {
2665
+ "offsets": [
2666
+ 384
2667
+ ],
2668
+ "shape": [
2669
+ 384
2670
+ ],
2671
+ "filename_index": 1
2672
+ }
2673
+ ]
2674
+ },
2675
+ "h.9.ln_1.bias": {
2676
+ "type": "Distributed",
2677
+ "shape": [
2678
+ 768
2679
+ ],
2680
+ "dtype": "F32",
2681
+ "chunks": [
2682
+ {
2683
+ "offsets": [
2684
+ 0
2685
+ ],
2686
+ "shape": [
2687
+ 384
2688
+ ],
2689
+ "filename_index": 0
2690
+ },
2691
+ {
2692
+ "offsets": [
2693
+ 384
2694
+ ],
2695
+ "shape": [
2696
+ 384
2697
+ ],
2698
+ "filename_index": 1
2699
+ }
2700
+ ]
2701
+ },
2702
+ "h.3.ln_1.weight": {
2703
+ "type": "Distributed",
2704
+ "shape": [
2705
+ 768
2706
+ ],
2707
+ "dtype": "F32",
2708
+ "chunks": [
2709
+ {
2710
+ "offsets": [
2711
+ 0
2712
+ ],
2713
+ "shape": [
2714
+ 384
2715
+ ],
2716
+ "filename_index": 0
2717
+ },
2718
+ {
2719
+ "offsets": [
2720
+ 384
2721
+ ],
2722
+ "shape": [
2723
+ 384
2724
+ ],
2725
+ "filename_index": 1
2726
+ }
2727
+ ]
2728
+ },
2729
+ "h.8.ln_2.weight": {
2730
+ "type": "Distributed",
2731
+ "shape": [
2732
+ 768
2733
+ ],
2734
+ "dtype": "F32",
2735
+ "chunks": [
2736
+ {
2737
+ "offsets": [
2738
+ 0
2739
+ ],
2740
+ "shape": [
2741
+ 384
2742
+ ],
2743
+ "filename_index": 0
2744
+ },
2745
+ {
2746
+ "offsets": [
2747
+ 384
2748
+ ],
2749
+ "shape": [
2750
+ 384
2751
+ ],
2752
+ "filename_index": 1
2753
+ }
2754
+ ]
2755
+ },
2756
+ "h.4.attn.c_attn.bias": {
2757
+ "type": "Distributed",
2758
+ "shape": [
2759
+ 2304
2760
+ ],
2761
+ "dtype": "F32",
2762
+ "chunks": [
2763
+ {
2764
+ "offsets": [
2765
+ 0
2766
+ ],
2767
+ "shape": [
2768
+ 1152
2769
+ ],
2770
+ "filename_index": 0
2771
+ },
2772
+ {
2773
+ "offsets": [
2774
+ 1152
2775
+ ],
2776
+ "shape": [
2777
+ 1152
2778
+ ],
2779
+ "filename_index": 1
2780
+ }
2781
+ ]
2782
+ },
2783
+ "h.6.mlp.c_fc.bias": {
2784
+ "type": "Distributed",
2785
+ "shape": [
2786
+ 3072
2787
+ ],
2788
+ "dtype": "F32",
2789
+ "chunks": [
2790
+ {
2791
+ "offsets": [
2792
+ 0
2793
+ ],
2794
+ "shape": [
2795
+ 1536
2796
+ ],
2797
+ "filename_index": 0
2798
+ },
2799
+ {
2800
+ "offsets": [
2801
+ 1536
2802
+ ],
2803
+ "shape": [
2804
+ 1536
2805
+ ],
2806
+ "filename_index": 1
2807
+ }
2808
+ ]
2809
+ },
2810
+ "h.2.mlp.c_fc.bias": {
2811
+ "type": "Distributed",
2812
+ "shape": [
2813
+ 3072
2814
+ ],
2815
+ "dtype": "F32",
2816
+ "chunks": [
2817
+ {
2818
+ "offsets": [
2819
+ 0
2820
+ ],
2821
+ "shape": [
2822
+ 1536
2823
+ ],
2824
+ "filename_index": 0
2825
+ },
2826
+ {
2827
+ "offsets": [
2828
+ 1536
2829
+ ],
2830
+ "shape": [
2831
+ 1536
2832
+ ],
2833
+ "filename_index": 1
2834
+ }
2835
+ ]
2836
+ },
2837
+ "h.8.mlp.c_proj.weight": {
2838
+ "type": "Distributed",
2839
+ "shape": [
2840
+ 3072,
2841
+ 768
2842
+ ],
2843
+ "dtype": "F32",
2844
+ "chunks": [
2845
+ {
2846
+ "offsets": [
2847
+ 0,
2848
+ 0
2849
+ ],
2850
+ "shape": [
2851
+ 1536,
2852
+ 768
2853
+ ],
2854
+ "filename_index": 0
2855
+ },
2856
+ {
2857
+ "offsets": [
2858
+ 1536,
2859
+ 0
2860
+ ],
2861
+ "shape": [
2862
+ 1536,
2863
+ 768
2864
+ ],
2865
+ "filename_index": 1
2866
+ }
2867
+ ]
2868
+ },
2869
+ "h.11.ln_2.bias": {
2870
+ "type": "Distributed",
2871
+ "shape": [
2872
+ 768
2873
+ ],
2874
+ "dtype": "F32",
2875
+ "chunks": [
2876
+ {
2877
+ "offsets": [
2878
+ 0
2879
+ ],
2880
+ "shape": [
2881
+ 384
2882
+ ],
2883
+ "filename_index": 0
2884
+ },
2885
+ {
2886
+ "offsets": [
2887
+ 384
2888
+ ],
2889
+ "shape": [
2890
+ 384
2891
+ ],
2892
+ "filename_index": 1
2893
+ }
2894
+ ]
2895
+ },
2896
+ "h.1.ln_2.weight": {
2897
+ "type": "Distributed",
2898
+ "shape": [
2899
+ 768
2900
+ ],
2901
+ "dtype": "F32",
2902
+ "chunks": [
2903
+ {
2904
+ "offsets": [
2905
+ 0
2906
+ ],
2907
+ "shape": [
2908
+ 384
2909
+ ],
2910
+ "filename_index": 0
2911
+ },
2912
+ {
2913
+ "offsets": [
2914
+ 384
2915
+ ],
2916
+ "shape": [
2917
+ 384
2918
+ ],
2919
+ "filename_index": 1
2920
+ }
2921
+ ]
2922
+ },
2923
+ "wte.weight": {
2924
+ "type": "Distributed",
2925
+ "shape": [
2926
+ 50257,
2927
+ 768
2928
+ ],
2929
+ "dtype": "F32",
2930
+ "chunks": [
2931
+ {
2932
+ "offsets": [
2933
+ 0,
2934
+ 0
2935
+ ],
2936
+ "shape": [
2937
+ 50257,
2938
+ 384
2939
+ ],
2940
+ "filename_index": 0
2941
+ },
2942
+ {
2943
+ "offsets": [
2944
+ 0,
2945
+ 384
2946
+ ],
2947
+ "shape": [
2948
+ 50257,
2949
+ 384
2950
+ ],
2951
+ "filename_index": 1
2952
+ }
2953
+ ]
2954
+ },
2955
+ "h.7.ln_1.bias": {
2956
+ "type": "Distributed",
2957
+ "shape": [
2958
+ 768
2959
+ ],
2960
+ "dtype": "F32",
2961
+ "chunks": [
2962
+ {
2963
+ "offsets": [
2964
+ 0
2965
+ ],
2966
+ "shape": [
2967
+ 384
2968
+ ],
2969
+ "filename_index": 0
2970
+ },
2971
+ {
2972
+ "offsets": [
2973
+ 384
2974
+ ],
2975
+ "shape": [
2976
+ 384
2977
+ ],
2978
+ "filename_index": 1
2979
+ }
2980
+ ]
2981
+ },
2982
+ "h.6.attn.c_attn.bias": {
2983
+ "type": "Distributed",
2984
+ "shape": [
2985
+ 2304
2986
+ ],
2987
+ "dtype": "F32",
2988
+ "chunks": [
2989
+ {
2990
+ "offsets": [
2991
+ 0
2992
+ ],
2993
+ "shape": [
2994
+ 1152
2995
+ ],
2996
+ "filename_index": 0
2997
+ },
2998
+ {
2999
+ "offsets": [
3000
+ 1152
3001
+ ],
3002
+ "shape": [
3003
+ 1152
3004
+ ],
3005
+ "filename_index": 1
3006
+ }
3007
+ ]
3008
+ },
3009
+ "h.8.ln_1.bias": {
3010
+ "type": "Distributed",
3011
+ "shape": [
3012
+ 768
3013
+ ],
3014
+ "dtype": "F32",
3015
+ "chunks": [
3016
+ {
3017
+ "offsets": [
3018
+ 0
3019
+ ],
3020
+ "shape": [
3021
+ 384
3022
+ ],
3023
+ "filename_index": 0
3024
+ },
3025
+ {
3026
+ "offsets": [
3027
+ 384
3028
+ ],
3029
+ "shape": [
3030
+ 384
3031
+ ],
3032
+ "filename_index": 1
3033
+ }
3034
+ ]
3035
+ },
3036
+ "h.0.attn.c_attn.weight": {
3037
+ "type": "Distributed",
3038
+ "shape": [
3039
+ 768,
3040
+ 2304
3041
+ ],
3042
+ "dtype": "F32",
3043
+ "chunks": [
3044
+ {
3045
+ "offsets": [
3046
+ 0,
3047
+ 0
3048
+ ],
3049
+ "shape": [
3050
+ 768,
3051
+ 1152
3052
+ ],
3053
+ "filename_index": 0
3054
+ },
3055
+ {
3056
+ "offsets": [
3057
+ 0,
3058
+ 1152
3059
+ ],
3060
+ "shape": [
3061
+ 768,
3062
+ 1152
3063
+ ],
3064
+ "filename_index": 1
3065
+ }
3066
+ ]
3067
+ },
3068
+ "h.11.ln_2.weight": {
3069
+ "type": "Distributed",
3070
+ "shape": [
3071
+ 768
3072
+ ],
3073
+ "dtype": "F32",
3074
+ "chunks": [
3075
+ {
3076
+ "offsets": [
3077
+ 0
3078
+ ],
3079
+ "shape": [
3080
+ 384
3081
+ ],
3082
+ "filename_index": 0
3083
+ },
3084
+ {
3085
+ "offsets": [
3086
+ 384
3087
+ ],
3088
+ "shape": [
3089
+ 384
3090
+ ],
3091
+ "filename_index": 1
3092
+ }
3093
+ ]
3094
+ },
3095
+ "h.4.mlp.c_fc.weight": {
3096
+ "type": "Distributed",
3097
+ "shape": [
3098
+ 768,
3099
+ 3072
3100
+ ],
3101
+ "dtype": "F32",
3102
+ "chunks": [
3103
+ {
3104
+ "offsets": [
3105
+ 0,
3106
+ 0
3107
+ ],
3108
+ "shape": [
3109
+ 768,
3110
+ 1536
3111
+ ],
3112
+ "filename_index": 0
3113
+ },
3114
+ {
3115
+ "offsets": [
3116
+ 0,
3117
+ 1536
3118
+ ],
3119
+ "shape": [
3120
+ 768,
3121
+ 1536
3122
+ ],
3123
+ "filename_index": 1
3124
+ }
3125
+ ]
3126
+ },
3127
+ "h.9.attn.bias": {
3128
+ "type": "Distributed",
3129
+ "shape": [
3130
+ 1,
3131
+ 1,
3132
+ 1024,
3133
+ 1024
3134
+ ],
3135
+ "dtype": "F32",
3136
+ "chunks": [
3137
+ {
3138
+ "offsets": [
3139
+ 0,
3140
+ 0,
3141
+ 0,
3142
+ 0
3143
+ ],
3144
+ "shape": [
3145
+ 1,
3146
+ 1,
3147
+ 1024,
3148
+ 512
3149
+ ],
3150
+ "filename_index": 0
3151
+ },
3152
+ {
3153
+ "offsets": [
3154
+ 0,
3155
+ 0,
3156
+ 0,
3157
+ 512
3158
+ ],
3159
+ "shape": [
3160
+ 1,
3161
+ 1,
3162
+ 1024,
3163
+ 512
3164
+ ],
3165
+ "filename_index": 1
3166
+ }
3167
+ ]
3168
+ },
3169
+ "h.6.attn.c_proj.weight": {
3170
+ "type": "Distributed",
3171
+ "shape": [
3172
+ 768,
3173
+ 768
3174
+ ],
3175
+ "dtype": "F32",
3176
+ "chunks": [
3177
+ {
3178
+ "offsets": [
3179
+ 0,
3180
+ 0
3181
+ ],
3182
+ "shape": [
3183
+ 384,
3184
+ 768
3185
+ ],
3186
+ "filename_index": 0
3187
+ },
3188
+ {
3189
+ "offsets": [
3190
+ 384,
3191
+ 0
3192
+ ],
3193
+ "shape": [
3194
+ 384,
3195
+ 768
3196
+ ],
3197
+ "filename_index": 1
3198
+ }
3199
+ ]
3200
+ },
3201
+ "h.5.mlp.c_fc.bias": {
3202
+ "type": "Distributed",
3203
+ "shape": [
3204
+ 3072
3205
+ ],
3206
+ "dtype": "F32",
3207
+ "chunks": [
3208
+ {
3209
+ "offsets": [
3210
+ 0
3211
+ ],
3212
+ "shape": [
3213
+ 1536
3214
+ ],
3215
+ "filename_index": 0
3216
+ },
3217
+ {
3218
+ "offsets": [
3219
+ 1536
3220
+ ],
3221
+ "shape": [
3222
+ 1536
3223
+ ],
3224
+ "filename_index": 1
3225
+ }
3226
+ ]
3227
+ },
3228
+ "h.4.attn.bias": {
3229
+ "type": "Distributed",
3230
+ "shape": [
3231
+ 1,
3232
+ 1,
3233
+ 1024,
3234
+ 1024
3235
+ ],
3236
+ "dtype": "F32",
3237
+ "chunks": [
3238
+ {
3239
+ "offsets": [
3240
+ 0,
3241
+ 0,
3242
+ 0,
3243
+ 0
3244
+ ],
3245
+ "shape": [
3246
+ 1,
3247
+ 1,
3248
+ 1024,
3249
+ 512
3250
+ ],
3251
+ "filename_index": 0
3252
+ },
3253
+ {
3254
+ "offsets": [
3255
+ 0,
3256
+ 0,
3257
+ 0,
3258
+ 512
3259
+ ],
3260
+ "shape": [
3261
+ 1,
3262
+ 1,
3263
+ 1024,
3264
+ 512
3265
+ ],
3266
+ "filename_index": 1
3267
+ }
3268
+ ]
3269
+ },
3270
+ "h.2.attn.c_proj.weight": {
3271
+ "type": "Distributed",
3272
+ "shape": [
3273
+ 768,
3274
+ 768
3275
+ ],
3276
+ "dtype": "F32",
3277
+ "chunks": [
3278
+ {
3279
+ "offsets": [
3280
+ 0,
3281
+ 0
3282
+ ],
3283
+ "shape": [
3284
+ 384,
3285
+ 768
3286
+ ],
3287
+ "filename_index": 0
3288
+ },
3289
+ {
3290
+ "offsets": [
3291
+ 384,
3292
+ 0
3293
+ ],
3294
+ "shape": [
3295
+ 384,
3296
+ 768
3297
+ ],
3298
+ "filename_index": 1
3299
+ }
3300
+ ]
3301
+ },
3302
+ "h.1.mlp.c_fc.weight": {
3303
+ "type": "Distributed",
3304
+ "shape": [
3305
+ 768,
3306
+ 3072
3307
+ ],
3308
+ "dtype": "F32",
3309
+ "chunks": [
3310
+ {
3311
+ "offsets": [
3312
+ 0,
3313
+ 0
3314
+ ],
3315
+ "shape": [
3316
+ 768,
3317
+ 1536
3318
+ ],
3319
+ "filename_index": 0
3320
+ },
3321
+ {
3322
+ "offsets": [
3323
+ 0,
3324
+ 1536
3325
+ ],
3326
+ "shape": [
3327
+ 768,
3328
+ 1536
3329
+ ],
3330
+ "filename_index": 1
3331
+ }
3332
+ ]
3333
+ },
3334
+ "h.8.mlp.c_fc.bias": {
3335
+ "type": "Distributed",
3336
+ "shape": [
3337
+ 3072
3338
+ ],
3339
+ "dtype": "F32",
3340
+ "chunks": [
3341
+ {
3342
+ "offsets": [
3343
+ 0
3344
+ ],
3345
+ "shape": [
3346
+ 1536
3347
+ ],
3348
+ "filename_index": 0
3349
+ },
3350
+ {
3351
+ "offsets": [
3352
+ 1536
3353
+ ],
3354
+ "shape": [
3355
+ 1536
3356
+ ],
3357
+ "filename_index": 1
3358
+ }
3359
+ ]
3360
+ },
3361
+ "h.9.mlp.c_fc.bias": {
3362
+ "type": "Distributed",
3363
+ "shape": [
3364
+ 3072
3365
+ ],
3366
+ "dtype": "F32",
3367
+ "chunks": [
3368
+ {
3369
+ "offsets": [
3370
+ 0
3371
+ ],
3372
+ "shape": [
3373
+ 1536
3374
+ ],
3375
+ "filename_index": 0
3376
+ },
3377
+ {
3378
+ "offsets": [
3379
+ 1536
3380
+ ],
3381
+ "shape": [
3382
+ 1536
3383
+ ],
3384
+ "filename_index": 1
3385
+ }
3386
+ ]
3387
+ },
3388
+ "h.8.mlp.c_fc.weight": {
3389
+ "type": "Distributed",
3390
+ "shape": [
3391
+ 768,
3392
+ 3072
3393
+ ],
3394
+ "dtype": "F32",
3395
+ "chunks": [
3396
+ {
3397
+ "offsets": [
3398
+ 0,
3399
+ 0
3400
+ ],
3401
+ "shape": [
3402
+ 768,
3403
+ 1536
3404
+ ],
3405
+ "filename_index": 0
3406
+ },
3407
+ {
3408
+ "offsets": [
3409
+ 0,
3410
+ 1536
3411
+ ],
3412
+ "shape": [
3413
+ 768,
3414
+ 1536
3415
+ ],
3416
+ "filename_index": 1
3417
+ }
3418
+ ]
3419
+ },
3420
+ "h.8.attn.c_attn.weight": {
3421
+ "type": "Distributed",
3422
+ "shape": [
3423
+ 768,
3424
+ 2304
3425
+ ],
3426
+ "dtype": "F32",
3427
+ "chunks": [
3428
+ {
3429
+ "offsets": [
3430
+ 0,
3431
+ 0
3432
+ ],
3433
+ "shape": [
3434
+ 768,
3435
+ 1152
3436
+ ],
3437
+ "filename_index": 0
3438
+ },
3439
+ {
3440
+ "offsets": [
3441
+ 0,
3442
+ 1152
3443
+ ],
3444
+ "shape": [
3445
+ 768,
3446
+ 1152
3447
+ ],
3448
+ "filename_index": 1
3449
+ }
3450
+ ]
3451
+ },
3452
+ "h.0.ln_1.bias": {
3453
+ "type": "Distributed",
3454
+ "shape": [
3455
+ 768
3456
+ ],
3457
+ "dtype": "F32",
3458
+ "chunks": [
3459
+ {
3460
+ "offsets": [
3461
+ 0
3462
+ ],
3463
+ "shape": [
3464
+ 384
3465
+ ],
3466
+ "filename_index": 0
3467
+ },
3468
+ {
3469
+ "offsets": [
3470
+ 384
3471
+ ],
3472
+ "shape": [
3473
+ 384
3474
+ ],
3475
+ "filename_index": 1
3476
+ }
3477
+ ]
3478
+ },
3479
+ "h.1.ln_1.weight": {
3480
+ "type": "Distributed",
3481
+ "shape": [
3482
+ 768
3483
+ ],
3484
+ "dtype": "F32",
3485
+ "chunks": [
3486
+ {
3487
+ "offsets": [
3488
+ 0
3489
+ ],
3490
+ "shape": [
3491
+ 384
3492
+ ],
3493
+ "filename_index": 0
3494
+ },
3495
+ {
3496
+ "offsets": [
3497
+ 384
3498
+ ],
3499
+ "shape": [
3500
+ 384
3501
+ ],
3502
+ "filename_index": 1
3503
+ }
3504
+ ]
3505
+ },
3506
+ "h.5.attn.c_attn.weight": {
3507
+ "type": "Distributed",
3508
+ "shape": [
3509
+ 768,
3510
+ 2304
3511
+ ],
3512
+ "dtype": "F32",
3513
+ "chunks": [
3514
+ {
3515
+ "offsets": [
3516
+ 0,
3517
+ 0
3518
+ ],
3519
+ "shape": [
3520
+ 768,
3521
+ 1152
3522
+ ],
3523
+ "filename_index": 0
3524
+ },
3525
+ {
3526
+ "offsets": [
3527
+ 0,
3528
+ 1152
3529
+ ],
3530
+ "shape": [
3531
+ 768,
3532
+ 1152
3533
+ ],
3534
+ "filename_index": 1
3535
+ }
3536
+ ]
3537
+ },
3538
+ "h.3.attn.c_attn.bias": {
3539
+ "type": "Distributed",
3540
+ "shape": [
3541
+ 2304
3542
+ ],
3543
+ "dtype": "F32",
3544
+ "chunks": [
3545
+ {
3546
+ "offsets": [
3547
+ 0
3548
+ ],
3549
+ "shape": [
3550
+ 1152
3551
+ ],
3552
+ "filename_index": 0
3553
+ },
3554
+ {
3555
+ "offsets": [
3556
+ 1152
3557
+ ],
3558
+ "shape": [
3559
+ 1152
3560
+ ],
3561
+ "filename_index": 1
3562
+ }
3563
+ ]
3564
+ },
3565
+ "h.4.ln_2.weight": {
3566
+ "type": "Distributed",
3567
+ "shape": [
3568
+ 768
3569
+ ],
3570
+ "dtype": "F32",
3571
+ "chunks": [
3572
+ {
3573
+ "offsets": [
3574
+ 0
3575
+ ],
3576
+ "shape": [
3577
+ 384
3578
+ ],
3579
+ "filename_index": 0
3580
+ },
3581
+ {
3582
+ "offsets": [
3583
+ 384
3584
+ ],
3585
+ "shape": [
3586
+ 384
3587
+ ],
3588
+ "filename_index": 1
3589
+ }
3590
+ ]
3591
+ },
3592
+ "h.7.attn.c_proj.weight": {
3593
+ "type": "Distributed",
3594
+ "shape": [
3595
+ 768,
3596
+ 768
3597
+ ],
3598
+ "dtype": "F32",
3599
+ "chunks": [
3600
+ {
3601
+ "offsets": [
3602
+ 0,
3603
+ 0
3604
+ ],
3605
+ "shape": [
3606
+ 384,
3607
+ 768
3608
+ ],
3609
+ "filename_index": 0
3610
+ },
3611
+ {
3612
+ "offsets": [
3613
+ 384,
3614
+ 0
3615
+ ],
3616
+ "shape": [
3617
+ 384,
3618
+ 768
3619
+ ],
3620
+ "filename_index": 1
3621
+ }
3622
+ ]
3623
+ },
3624
+ "h.1.ln_2.bias": {
3625
+ "type": "Distributed",
3626
+ "shape": [
3627
+ 768
3628
+ ],
3629
+ "dtype": "F32",
3630
+ "chunks": [
3631
+ {
3632
+ "offsets": [
3633
+ 0
3634
+ ],
3635
+ "shape": [
3636
+ 384
3637
+ ],
3638
+ "filename_index": 0
3639
+ },
3640
+ {
3641
+ "offsets": [
3642
+ 384
3643
+ ],
3644
+ "shape": [
3645
+ 384
3646
+ ],
3647
+ "filename_index": 1
3648
+ }
3649
+ ]
3650
+ },
3651
+ "h.11.mlp.c_fc.weight": {
3652
+ "type": "Distributed",
3653
+ "shape": [
3654
+ 768,
3655
+ 3072
3656
+ ],
3657
+ "dtype": "F32",
3658
+ "chunks": [
3659
+ {
3660
+ "offsets": [
3661
+ 0,
3662
+ 0
3663
+ ],
3664
+ "shape": [
3665
+ 768,
3666
+ 1536
3667
+ ],
3668
+ "filename_index": 0
3669
+ },
3670
+ {
3671
+ "offsets": [
3672
+ 0,
3673
+ 1536
3674
+ ],
3675
+ "shape": [
3676
+ 768,
3677
+ 1536
3678
+ ],
3679
+ "filename_index": 1
3680
+ }
3681
+ ]
3682
+ },
3683
+ "h.6.attn.c_attn.weight": {
3684
+ "type": "Distributed",
3685
+ "shape": [
3686
+ 768,
3687
+ 2304
3688
+ ],
3689
+ "dtype": "F32",
3690
+ "chunks": [
3691
+ {
3692
+ "offsets": [
3693
+ 0,
3694
+ 0
3695
+ ],
3696
+ "shape": [
3697
+ 768,
3698
+ 1152
3699
+ ],
3700
+ "filename_index": 0
3701
+ },
3702
+ {
3703
+ "offsets": [
3704
+ 0,
3705
+ 1152
3706
+ ],
3707
+ "shape": [
3708
+ 768,
3709
+ 1152
3710
+ ],
3711
+ "filename_index": 1
3712
+ }
3713
+ ]
3714
+ },
3715
+ "h.11.attn.c_attn.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 2304
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1152
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1152
+ ],
+ "shape": [
+ 768,
+ 1152
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.0.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.5.mlp.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.1.attn.bias": {
+ "type": "Distributed",
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 1024
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0,
+ 0,
+ 0
+ ],
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 512
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 0,
+ 0,
+ 512
+ ],
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 512
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.11.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.10.mlp.c_fc.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1536
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.2.attn.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.0.mlp.c_fc.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1536
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.10.mlp.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.2.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.0.attn.c_attn.bias": {
+ "type": "Distributed",
+ "shape": [
+ 2304
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 1152
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1152
+ ],
+ "shape": [
+ 1152
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.5.mlp.c_fc.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1536
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.7.mlp.c_fc.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1536
+ ],
+ "shape": [
+ 768,
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.4.ln_1.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.3.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.2.ln_2.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.3.mlp.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.5.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.0.mlp.c_fc.bias": {
+ "type": "Distributed",
+ "shape": [
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536
+ ],
+ "shape": [
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.3.attn.c_attn.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 2304
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1152
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1152
+ ],
+ "shape": [
+ 768,
+ 1152
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.10.ln_2.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.11.attn.bias": {
+ "type": "Distributed",
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 1024
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0,
+ 0,
+ 0
+ ],
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 512
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 0,
+ 0,
+ 512
+ ],
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 512
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.4.attn.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.10.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.5.attn.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 384,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384,
+ 0
+ ],
+ "shape": [
+ 384,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.4.attn.c_attn.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768,
+ 2304
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 768,
+ 1152
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 1152
+ ],
+ "shape": [
+ 768,
+ 1152
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.0.ln_2.weight": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.7.attn.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.3.attn.c_proj.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.8.attn.bias": {
+ "type": "Distributed",
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 1024
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0,
+ 0,
+ 0
+ ],
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 512
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 0,
+ 0,
+ 0,
+ 512
+ ],
+ "shape": [
+ 1,
+ 1,
+ 1024,
+ 512
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.11.mlp.c_fc.bias": {
+ "type": "Distributed",
+ "shape": [
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536
+ ],
+ "shape": [
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.7.mlp.c_fc.bias": {
+ "type": "Distributed",
+ "shape": [
+ 3072
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 1536
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536
+ ],
+ "shape": [
+ 1536
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.5.ln_1.bias": {
+ "type": "Distributed",
+ "shape": [
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 384
+ ],
+ "shape": [
+ 384
+ ],
+ "filename_index": 1
+ }
+ ]
+ },
+ "h.6.mlp.c_proj.weight": {
+ "type": "Distributed",
+ "shape": [
+ 3072,
+ 768
+ ],
+ "dtype": "F32",
+ "chunks": [
+ {
+ "offsets": [
+ 0,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 0
+ },
+ {
+ "offsets": [
+ 1536,
+ 0
+ ],
+ "shape": [
+ 1536,
+ 768
+ ],
+ "filename_index": 1
+ }
+ ]
+ }
+ },
+ "filenames": [
+ "rank0.safetensors",
+ "rank1.safetensors"
+ ],
+ "n_ranks": 2
+ }
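
Each "Distributed" entry above records a full tensor `shape` plus one chunk per rank, with `offsets`, a per-chunk `shape`, and a `filename_index` into the `filenames` list. The following is a minimal sketch of a consistency check for such entries, not the tool that produced this file. The function name `check_tensor` and the `sample` dict are hypothetical; the sample copies two entries from the diff. The key that holds the tensor map at the top level of topology.json is not visible in this part of the diff, so the sketch takes a single entry directly.

```python
from math import prod


def check_tensor(name, entry, n_files):
    """Return a list of problems found in one "Distributed" tensor entry.

    Hypothetical helper: it only checks bounds and element counts, so it
    would not detect two chunks that overlap while another region is missing.
    """
    problems = []
    full = entry["shape"]
    covered = 0
    for i, chunk in enumerate(entry["chunks"]):
        offsets, shape = chunk["offsets"], chunk["shape"]
        if len(offsets) != len(full) or len(shape) != len(full):
            problems.append(f"{name}: chunk {i} has a different rank than the tensor")
            continue
        # Each chunk must lie inside the full tensor along every axis.
        for axis, (o, s, f) in enumerate(zip(offsets, shape, full)):
            if o < 0 or s <= 0 or o + s > f:
                problems.append(f"{name}: chunk {i} is out of bounds on axis {axis}")
        # filename_index must point at an entry of the "filenames" list.
        if not 0 <= chunk["filename_index"] < n_files:
            problems.append(f"{name}: chunk {i} has an invalid filename_index")
        covered += prod(shape)
    if covered != prod(full):
        problems.append(f"{name}: chunks hold {covered} of {prod(full)} elements")
    return problems


# Two entries copied from the diff above, used as sample input.
sample = {
    "h.0.mlp.c_fc.bias": {
        "type": "Distributed", "shape": [3072], "dtype": "F32",
        "chunks": [
            {"offsets": [0], "shape": [1536], "filename_index": 0},
            {"offsets": [1536], "shape": [1536], "filename_index": 1},
        ],
    },
    "h.7.attn.c_proj.weight": {
        "type": "Distributed", "shape": [768, 768], "dtype": "F32",
        "chunks": [
            {"offsets": [0, 0], "shape": [384, 768], "filename_index": 0},
            {"offsets": [384, 0], "shape": [384, 768], "filename_index": 1},
        ],
    },
}
filenames = ["rank0.safetensors", "rank1.safetensors"]

for tensor_name, tensor_entry in sample.items():
    for problem in check_tensor(tensor_name, tensor_entry, len(filenames)):
        print(problem)
```

In the entries shown, the axis that is split varies by tensor: `c_fc.weight` and `c_attn.weight` are split along the last axis, `c_proj.weight` along the first. A check driven by `offsets` and `shape` therefore does not need to assume a particular split axis.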