Jungang Li

LJungang

AI & ML interests

None yet

Recent Activity

updated a collection 3 days ago

GUI-Agent Benchmark

commented on a paper 6 days ago

PEARL: Personalized Streaming Video Understanding Model

upvoted a paper 6 days ago

PEARL: Personalized Streaming Video Understanding Model

Organizations

upvoted a paper 6 days ago

PEARL: Personalized Streaming Video Understanding Model

Paper • 2603.20422 • Published 9 days ago • 38

upvoted a paper 8 days ago

Generation Models Know Space: Unleashing Implicit 3D Priors for Scene Understanding

Paper • 2603.19235 • Published 10 days ago • 93

upvoted a paper 9 days ago

Unlocking Multimodal Document Intelligence: From Current Triumphs to Future Frontiers of Visual Document Retrieval

Paper • 2602.19961 • Published Feb 23 • 2

upvoted a paper 10 days ago

AndroTMem: From Interaction Trajectories to Anchored Memory in Long-Horizon GUI Agents

Paper • 2603.18429 • Published 11 days ago • 26

upvoted a paper 11 days ago

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Paper • 2603.17541 • Published 12 days ago • 20

upvoted a paper about 1 month ago

BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models

Paper • 2602.04163 • Published Feb 4 • 10

upvoted a paper about 2 months ago

OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

Paper • 2602.04804 • Published Feb 4 • 49

upvoted a paper 3 months ago

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Paper • 2512.22905 • Published Dec 28, 2025 • 20

upvoted 2 papers 6 months ago

Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios

Paper • 2411.02708 • Published Nov 5, 2024 • 1

MOSS-ChatV: Reinforcement Learning with Process Reasoning Reward for Video Temporal Reasoning

Paper • 2509.21113 • Published Sep 25, 2025 • 6

upvoted a paper 7 months ago

Mind the Third Eye! Benchmarking Privacy Awareness in MLLM-powered Smartphone Agents

Paper • 2508.19493 • Published Aug 27, 2025 • 11

upvoted 5 papers 10 months ago

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Paper • 2506.10857 • Published Jun 12, 2025 • 30

SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models

Paper • 2505.18812 • Published May 24, 2025 • 2

VideoMark: A Distortion-Free Robust Watermarking Framework for Video Diffusion Models

Paper • 2504.16359 • Published Apr 23, 2025 • 3

Video World Models with Long-term Spatial Memory

Paper • 2506.05284 • Published Jun 5, 2025 • 56

AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs

Paper • 2506.05328 • Published Jun 5, 2025 • 21

upvoted a collection 10 months ago

Qwen3

Collection

84 items • Updated Dec 31, 2025 • 1.73k

upvoted 3 papers 10 months ago

Unveiling Instruction-Specific Neurons & Experts: An Analytical Framework for LLM's Instruction-Following Capabilities

Paper • 2505.21191 • Published May 27, 2025 • 3

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

Paper • 2505.15929 • Published May 21, 2025 • 49

SAVEn-Vid: Synergistic Audio-Visual Integration for Enhanced Understanding in Long Video Context

Paper • 2411.16213 • Published Nov 25, 2024 • 2
