Can MLLMs Understand the Deep Implication Behind Chinese Images? • Paper • 2410.13854 • Published Oct 17, 2024
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models • Paper • 2401.13919 • Published Jan 25, 2024