Pretraining Data: English NOW corpus (english-corpora.org/now). Paper: arxiv.org/abs/2601.15236. Code: github.com/iamshnoo/metadata_localization
- iamshnoo/combined_with_metadata_500m
  Text Generation • 0.7B • Updated • 199
- iamshnoo/combined_without_metadata_500m
  Text Generation • 0.7B • Updated • 202
- iamshnoo/combined_with_metadata_1b_step2k
  Text Generation • 1B • Updated • 226
- iamshnoo/combined_with_metadata_1b_step4k
  Text Generation • 1B • Updated • 223