Recent Activity
Papers
Making Dialogue Grounding Data Rich: A Three-Tier Data Synthesis Framework for Generalized Referring Expression Comprehension
Seeing the Forest and the Trees: Query-Aware Tokenizer for Long-Video Multimodal Language Models
Hey 👋! Welcome to our team's corner on Hugging Face! We're a bunch of enthusiastic folks who are totally into the exciting world of Multimodal Large Language Models.
Our research explores innovative ways to enhance interactions between language and images, video, and audio, aiming to advance the capabilities of AI in understanding and generating multimodal content.
We're a curious bunch, always on the lookout for cool ways to make AI systems understand and generate human-like language.