ARC Lab, Tencent PCG
company
Verified
AI & ML interests
ARC focuses mainly on computer vision, speech, and natural language processing, including speech/video generation, enhancement, retrieval, understanding, and AutoML. Guided by research developments and industry trends, ARC pursues exploration, innovation, and breakthroughs in these technologies.
Papers
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
ARC-Chapter: Structuring Hour-Long Videos into Navigable Chapters and Hierarchical Summaries
Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models
Autoregressive Long Video Diffusion in Real Time
Streamlining Cartoon Production with Generative Post-Keyframing
BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
Inpainting-based image instruction editing
A 3.2B text-to-image model distilled from FLUX
A Series of Powerful Visual Tokenizers
Let us create photos/paintings/avatars for anyone in any style within seconds.
TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs
A customization method
Crafter series models for 3D reconstruction and generation
- GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors (Paper 2504.01016)
- TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models (Paper 2503.05638)
- StereoCrafter: Diffusion-based Generation of Long and High-fidelity Stereoscopic 3D from Monocular Videos (Paper 2409.07447)
- DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos (Paper 2409.02095)
Any-length Video Inpainting and Editing with Plug-and-Play Context Control
Retrieval-based manga sequence colorization
Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
The smallest and most efficient control models for SDXL!