
Loser Cheems PRO

JingzeShi

·

https://github.com/LoserCheems

LoserCheems

AI & ML interests

I like training small language models.

Recent Activity

updated a model about 21 hours ago

JingzeShi/OpenSeek-1.4B-A0.4B

new activity 9 days ago

SmallDoge/Doge-60M:ImportError: cannot import name 'LossKwargs' from 'transformers.utils'

updated a model 12 days ago

SmallDoge/Doge-40M-MoE-checkpoint


Organizations

published an article 19 days ago

Article

Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models

By

and 2 others
