Shengyi Qian

shengyi-qian

https://jasonqsy.github.io/

AI & ML interests

AI Agents, Vision-Language Models

Recent Activity

updated a model about 2 months ago

MoeReward/rl_checkpoints

published a model 5 months ago

MoeReward/rl_checkpoints

authored 2 papers about 1 year ago

Multi-Object Hallucination in Vision-Language Models

Paper • 2407.06192 • Published Jul 8, 2024 • 12

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Paper • 2406.05132 • Published Jun 7, 2024 • 31

authored a paper almost 2 years ago

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Paper • 2309.12311 • Published Sep 21, 2023 • 17

authored 2 papers about 2 years ago

Understanding 3D Object Articulation in Internet Videos

Paper • 2203.16531 • Published Mar 30, 2022

Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation

Paper • 2303.11329 • Published Mar 20, 2023 • 1

authored a paper over 2 years ago

Understanding 3D Object Interaction from a Single Image

Paper • 2305.09664 • Published May 16, 2023 • 2