Yixuan Wei

EasonWei

weiyx16

AI & ML interests

None yet

Recent Activity

upvoted a paper 11 days ago

FP4 All the Way: Fully Quantized Training of LLMs

Paper • 2505.19115 • Published May 25 • 3

upvoted a paper 13 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published Jul 24 • 290

upvoted a paper 14 days ago

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Paper • 2506.05209 • Published Jun 5 • 46

liked a model about 1 month ago

moonshotai/Kimi-K2-Base

Text Generation • Updated Jul 13 • 7.07k • 264

upvoted a paper 3 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 129

commented a paper 9 months ago

COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training

Paper • 2410.19313 • Published Oct 25, 2024 • 19

commented a paper 10 months ago

COAT: Compressing Optimizer states and Activation for Memory-Efficient FP8 Training

Paper • 2410.19313 • Published Oct 25, 2024 • 19

upvoted a paper 10 months ago

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30, 2024 • 18

liked 2 models 11 months ago

stepfun-ai/GOT-OCR2_0

Image-Text-to-Text • 0.7B • Updated Feb 4 • 64.1k • 1.51k

black-forest-labs/FLUX.1-dev

Text-to-Image • Updated Jun 27 • 1.43M • 11.2k

upvoted a collection 11 months ago

Qwen2.5

Collection

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 46 items • Updated Jul 21 • 632

liked a dataset 12 months ago

edinburgh-dawg/mmlu-redux

Viewer • Updated Feb 9 • 3k • 2.55k • 32

New activity in 1bitLLM/bitnet_b1_58-3B about 1 year ago

Is it bitnet {-1,0,1}?

#6 opened over 1 year ago by Remek

authored a paper about 1 year ago

Xwin-LM: Strong and Scalable Alignment Practice for LLMs

Paper • 2405.20335 • Published May 30, 2024 • 18

authored a paper over 1 year ago

Common 7B Language Models Already Possess Strong Math Capabilities

Paper • 2403.04706 • Published Mar 7, 2024 • 21

liked a model over 1 year ago

adept/fuyu-8b

Image-Text-to-Text • 9B • Updated Nov 4, 2023 • 54.6k • 1.01k

liked a dataset over 1 year ago

fka/awesome-chatgpt-prompts

Viewer • Updated Jan 6 • 203 • 37.6k • 8.82k

liked a model almost 2 years ago

01-ai/Yi-6B

Text Generation • 6B • Updated Nov 11, 2024 • 14.3k • 374

authored a paper almost 2 years ago

FP8-LM: Training FP8 Large Language Models

Paper • 2310.18313 • Published Oct 27, 2023 • 33

liked a dataset almost 2 years ago

openbmb/UltraFeedback

Viewer • Updated Dec 29, 2023 • 64k • 2.23k • 380
