Sejung Son

SAISON17

AI & ML interests

None yet

Recent Activity

upvoted 3 articles 7 days ago

Article

What’s MXFP4? The 4-Bit Secret Powering OpenAI’s GPT‑OSS Models on Modest Hardware

16 days ago • 18

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

Jan 30 • 117

Article

ChatML vs Harmony: Understanding the new Format from OpenAI 🔍

15 days ago • 27

liked a dataset 14 days ago

allganize/IFEval-Ko

Viewer • Updated Jun 9 • 342 • 685 • 10

liked a model 19 days ago

openai/gpt-oss-120b

Text Generation • 120B • Updated 11 days ago • 1.59M • 3.57k

liked a dataset 23 days ago

gaia-benchmark/GAIA

Updated Feb 13 • 11.2k • 422

upvoted an article 24 days ago

Article

Decoding Strategies in Large Language Models

Oct 29, 2024 • 77

upvoted an article about 1 month ago

Article

Mixture of Experts Explained

and 5 others • Dec 11, 2023 • 846

liked a model about 1 month ago

moonshotai/Kimi-K2-Instruct

Text Generation • Updated 14 days ago • 425k • 2.11k

upvoted an article about 1 month ago

Article

Assisted Generation: a new direction toward low-latency text generation

May 11, 2023 • 70

liked 4 datasets 3 months ago

liked a model 3 months ago

Tongyi-Zhiwen/QwenLong-L1-32B

Text Generation • 33B • Updated Jun 9 • 2.36k • 158

liked a dataset 3 months ago

KRX-Data/Won-Instruct

Viewer • Updated Apr 18 • 86k • 450 • 23

liked 3 models 3 months ago

KRX-Data/WON-Reasoning

8B • Updated Apr 11 • 162 • 12

BAAI/bge-m3

intfloat/multilingual-e5-large

Feature Extraction • 0.6B • Updated Feb 17 • 2.93M • 1.02k

upvoted an article 4 months ago

Article

Fine-tuning Llama 2 70B using PyTorch FSDP

and 3 others • Sep 13, 2023 • 29
