Beyond Language Modeling: An Exploration of Multimodal Pretraining • Paper • 2603.03276 • Published 2 days ago • 64
WildRayZer: Self-supervised Large View Synthesis in Dynamic Environments • Paper • 2601.10716 • Published Jan 15 • 4
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders • Paper • 2601.16208 • Published Jan 22 • 53
Bidirectional Normalizing Flow: From Data to Noise and Back • Paper • 2512.10953 • Published Dec 11, 2025 • 7
Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation • Paper • 2512.16913 • Published Dec 18, 2025 • 34
Next-Embedding Prediction Makes Strong Vision Learners • Paper • 2512.16922 • Published Dec 18, 2025 • 87
Diffusion Transformers with Representation Autoencoders • Paper • 2510.11690 • Published Oct 13, 2025 • 166
DiffusionPDE: Generative PDE-Solving Under Partial Observation • Paper • 2406.17763 • Published Jun 25, 2024 • 24