Elie Bakouch
eliebak
270 followers · 374 following
eliebak.hf.co
AI & ML interests
Training LLMs @ 🤗
Recent Activity
liked a model about 18 hours ago: Motif-Technologies/Motif-2.6B
posted an update about 19 hours ago:
Motif 2.6B tech report is pretty insane, first time I've seen a model with differential attention and PolyNorm trained at scale!
> It's trained on 2.5T tokens, with a "data mixture schedule" to continuously adjust the mixture over training.
> They use WSD with a "simple moving average", averaging the last 6 checkpoints every 8B tokens.
> They trained on FineMath, FineWeb2, DCLM, TxT360.
> Lots of details on the finetuning data they used; for instance, they used EvolKit and did some "dataset fusion" to pack more compressed knowledge into the data.
> They mention they also tried Normalized GPT, QK-Norm and Cross Layer Attention.
https://huggingface.co/Motif-Technologies/Motif-2.6B
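The checkpoint averaging mentioned in the post can be illustrated with a rough sketch. This is not the Motif team's code: it assumes checkpoints are saved at a fixed token interval (8B tokens in the post) and that the "simple moving average" is a plain element-wise mean over the most recent 6 of them. The names `CheckpointSMA` and `average_checkpoints` are hypothetical, and parameters are modeled as plain lists of floats to keep the example dependency-free.

```python
from collections import deque


def average_checkpoints(checkpoints):
    """Element-wise mean of parameter dicts that share keys and shapes."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint to average")
    n = len(checkpoints)
    return {
        name: [sum(values) / n for values in zip(*(c[name] for c in checkpoints))]
        for name in checkpoints[0]
    }


class CheckpointSMA:
    """Keeps the last `window` checkpoints and averages them on demand.

    Assumption: `add` is called once per save interval (e.g. every 8B tokens).
    """

    def __init__(self, window=6):
        # deque with maxlen drops the oldest checkpoint automatically
        self.buffer = deque(maxlen=window)

    def add(self, checkpoint):
        self.buffer.append(checkpoint)

    def averaged(self):
        return average_checkpoints(list(self.buffer))


# Toy usage with a window of 2 so the arithmetic is easy to follow.
sma = CheckpointSMA(window=2)
sma.add({"w": [1.0, 2.0]})
sma.add({"w": [3.0, 4.0]})
sma.add({"w": [5.0, 6.0]})  # the first checkpoint falls out of the window
print(sma.averaged())
```

With real models the same idea applies to each tensor in a state dict; the averaged weights are typically used for evaluation or release while training continues from the unaveraged ones.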
liked a model about 19 hours ago: Motif-Technologies/activation
eliebak's datasets (3)
eliebak/very-smollm-corpus • Updated Sep 9, 2024 • 4.58M • 2 • 2
eliebak/Buzz_wo_chatml_format • Updated Jun 25, 2024 • 31.2M • 110 • 1
eliebak/Buzz_chatml_format • Updated Jun 15, 2024 • 31.2M • 312