Tar (Collection): Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations • 10 items • Updated 10 days ago • 1
Compressed Image Generation with Denoising Diffusion Codebook Models: Generate compressed images given different input conditions
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation Paper • 2508.05635 • Published 17 days ago • 72
ScreenCoder: Advancing Visual-to-Code Generation for Front-End Automation via Modular Multimodal Agents Paper • 2507.22827 • Published 25 days ago • 94