Pham Van Linh

phamvanlinh143

AI & ML interests

OCR, AI, DL

Organizations

None yet

liked a model 1 day ago

datalab-to/chandra

Image-to-Text • 9B • Updated Oct 21, 2025 • 496k • 470

upvoted 3 papers 3 days ago

Reinforcement Learning via Self-Distillation

Paper • 2601.20802 • Published 6 days ago • 35

Efficient Memory Management for Large Language Model Serving with PagedAttention

Paper • 2309.06180 • Published Sep 12, 2023 • 34

SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14, 2025 • 138

upvoted an article 3 days ago

Performant local mixture-of-experts CPU inference with GPU acceleration in llama.cpp

Article • 5 days ago

liked a model 3 days ago

nvidia/NVIDIA-Nemotron-Parse-v1.1

Image-Text-to-Text • Updated 7 days ago • 113k • 137

upvoted an article 9 days ago

LightOnOCR-2-1B: a lightweight high-performance end-to-end OCR model family

Article • 15 days ago

upvoted 4 articles about 1 month ago

The Optimal Architecture for Small Language Models

Article • Dec 26, 2025 • 113

Tokenization in Transformers v5: Simpler, Clearer, and More Modular

Article • Dec 18, 2025 • 119

Shrinking Giants: The Quantization Mathematics Making LLMs Accessible

Article • May 3, 2025

A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes

Article • Aug 17, 2022 • 123

liked 2 Spaces about 1 month ago

The Smol Training Playbook

📚

2.95k

The secrets to building world-class LLMs

The Ultra-Scale Playbook

🌌

3.67k

The ultimate guide to training LLM on large GPU Clusters

upvoted 3 articles about 2 months ago

Everything You Need to Know about Knowledge Distillation

Article • Mar 6, 2025

Mastering Tensor Dimensions in Transformers

Article • Jan 12, 2025 • 132

Understanding BigBird's Block Sparse Attention

Article • Mar 31, 2021

upvoted 2 articles 2 months ago

Transformers Are Getting Old: Variants and Alternatives Exist!

Article • Jul 5, 2025

Design choices for Vision Language Models in 2024

Article • Apr 16, 2024

upvoted 2 collections 2 months ago

ByteDance Papers

Collection

ByteDance papers collection • 138 items • Updated 2 days ago • 26

Deepseek Papers

Collection

Deepseek papers collection • 29 items • Updated 2 days ago • 319
