Hugging Face
Elie Bakouch
eliebak
270 followers · 374 following
eliebakouch
eliebak
eliebak
eliebak.hf.co
AI & ML interests
Training LLMs @ 🤗
Recent Activity
Liked a model about 22 hours ago: Motif-Technologies/Motif-2.6B
Posted an update about 22 hours ago:
The Motif 2.6B tech report is pretty insane, first time I've seen a model with differential attention and PolyNorm trained at scale!
> It's trained on 2.5T tokens, with a "data mixture schedule" to continuously adjust the mixture over training.
> They use WSD with a "Simple moving average", averaging the last 6 checkpoints every 8B tokens.
> They trained on FineMath, FineWeb2, DCLM, TxT360.
> Lots of details on the finetuning data they used; for instance they used EvolKit and did some "dataset fusion" to pack more compressed knowledge into the data.
> They mention they also tried Normalized GPT, QK-Norm and Cross Layer Attention.
https://huggingface.co/Motif-Technologies/Motif-2.6B
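To make the two training tricks in the post concrete, here is a minimal sketch of a WSD (warmup-stable-decay) learning-rate schedule and a simple moving average over the most recent checkpoints. This is not the Motif code: the names `wsd_lr` and `CheckpointSMA`, the linear warmup and decay shapes, and every default value except the 6-checkpoint window are assumptions made for illustration. Checkpoints are plain dicts of floats so the example runs without any ML library.

```python
from collections import deque


def wsd_lr(step, total_steps, peak_lr=3e-4, warmup_frac=0.01,
           decay_frac=0.1, min_lr=0.0):
    """Warmup-Stable-Decay: linear warmup, flat plateau, linear decay.

    All default values here are hypothetical, not taken from the report.
    """
    warmup_steps = int(total_steps * warmup_frac)
    decay_steps = int(total_steps * decay_frac)
    decay_start = total_steps - decay_steps
    if step < warmup_steps:
        # Ramp up linearly to the peak learning rate.
        return peak_lr * (step + 1) / warmup_steps
    if step < decay_start:
        # Stable phase: hold the peak learning rate.
        return peak_lr
    # Decay phase: interpolate linearly from peak_lr down to min_lr.
    progress = min(1.0, (step - decay_start) / max(1, decay_steps))
    return peak_lr + (min_lr - peak_lr) * progress


class CheckpointSMA:
    """Uniform average of the last `window` checkpoints."""

    def __init__(self, window=6):
        # deque with maxlen drops the oldest checkpoint automatically.
        self.window = deque(maxlen=window)

    def add(self, state):
        # Store a shallow copy so later training updates do not alter it.
        self.window.append(dict(state))

    def average(self):
        n = len(self.window)
        return {name: sum(s[name] for s in self.window) / n
                for name in self.window[0]}
```

In a real run, `add` would be called each time a checkpoint is saved (every 8B tokens in the setup the post describes), and `average` would produce the weights that get evaluated or released. With tensors instead of floats the same averaging works unchanged as long as the values support `+` and `/`.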
Liked a model about 23 hours ago: Motif-Technologies/activation
eliebak's models (12)
eliebak/SmolLM-360M-Instruct-Q8_0-GGUF • 0.4B • Updated Aug 13, 2024 • 13
eliebak/the-tokenizer-v1.5 • Updated Jul 4, 2024
eliebak/the-tokenizer-v2 • Updated Jun 17, 2024
eliebak/wsd_124M_300B_fw • Text Generation • 0.1B • Updated Jun 11, 2024 • 3
eliebak/wsd_124M_300B_edu • Text Generation • 0.1B • Updated Jun 11, 2024 • 3
eliebak/wsd_124M_150B_edu • Text Generation • 0.1B • Updated Jun 11, 2024 • 4
eliebak/wsd_124M_150B_fw • Text Generation • 0.1B • Updated Jun 11, 2024 • 3
eliebak/cos_124M_150B_fw • Text Generation • 0.1B • Updated Jun 9, 2024 • 3
eliebak/cos_124M_150B_edu • Text Generation • 0.1B • Updated Jun 9, 2024 • 4
eliebak/debug-cos-100B • Text Generation • 0.1B • Updated Jun 8, 2024 • 4
eliebak/dummy-model • Text Generation • 0.1B • Updated Jun 7, 2024 • 4
eliebak/smoltokenizer • Updated Apr 30, 2024