yuzhaouoe (Yu Zhao)

upvoted a paper 2 months ago

OpenSIR: Open-Ended Self-Improving Reasoner

Paper • 2511.00602 • Published Nov 1, 2025 • 20

upvoted a paper 3 months ago

Learning GUI Grounding with Spatial Reasoning from Visual Feedback

Paper • 2509.21552 • Published Sep 25, 2025 • 11

upvoted 2 papers 6 months ago

Inverse Scaling in Test-Time Compute

Paper • 2507.14417 • Published Jul 19, 2025 • 27

NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11, 2025 • 80

upvoted a paper 7 months ago

Unleashing the Reasoning Potential of Pre-trained LLMs by Critique Fine-Tuning on One Problem

Paper • 2506.03295 • Published Jun 3, 2025 • 17

upvoted a paper 8 months ago

MMLongBench: Benchmarking Long-Context Vision-Language Models Effectively and Thoroughly

Paper • 2505.10610 • Published May 15, 2025 • 54

upvoted an article 10 months ago

🦸🏻#1: Open-endedness and AI Agents – A Path from Generative to Creative AI?

Article • Dec 25, 2024 • 16

upvoted a paper 10 months ago

Q-Filters: Leveraging QK Geometry for Efficient KV Cache Compression

Paper • 2503.02812 • Published Mar 4, 2025 • 10

upvoted a collection 10 months ago

Q-Filters

Collection

Pre-computed Q-Filters for efficient KV cache compression. • 15 items • Updated Mar 3, 2025 • 7

upvoted a paper 11 months ago

Competitive Programming with Large Reasoning Models

Paper • 2502.06807 • Published Feb 3, 2025 • 68

upvoted 3 papers about 1 year ago

DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations

Paper • 2410.18860 • Published Oct 24, 2024 • 11

Analysing the Residual Stream of Language Models Under Knowledge Conflicts

Paper • 2410.16090 • Published Oct 21, 2024 • 7

Steering Knowledge Selection Behaviours in LLMs via SAE-Based Representation Engineering

Paper • 2410.15999 • Published Oct 21, 2024 • 20

upvoted a paper over 1 year ago

Gemma Scope: Open Sparse Autoencoders Everywhere All At Once on Gemma 2

Paper • 2408.05147 • Published Aug 9, 2024 • 40

upvoted an article over 1 year ago

Improving Hugging Face Training Efficiency Through Packing with Flash Attention 2

Article • Aug 21, 2024 • 42

upvoted a collection over 1 year ago

🔍 Interpretability & Analysis of LMs

Collection

Outstanding research in LM interpretability and evaluation, summarized • 135 items • Updated 17 days ago • 116

upvoted an article over 1 year ago

Introducing RWKV - An RNN with the advantages of a transformer

Article • May 15, 2023 • 23

upvoted 3 papers over 1 year ago

Model Internals-based Answer Attribution for Trustworthy Retrieval-Augmented Generation

Paper • 2406.13663 • Published Jun 19, 2024 • 7

A Simple and Effective L_2 Norm-Based Strategy for KV Cache Compression

Paper • 2406.11430 • Published Jun 17, 2024 • 25

Are We Done with MMLU?

Paper • 2406.04127 • Published Jun 6, 2024 • 39

Yu Zhao

AI & ML interests

Organizations
