SparseFormer: Sparse Visual Recognition via Limited Latent Tokens (arXiv:2304.03768, published Apr 7, 2023)
Bootstrapping SparseFormers from Vision Foundation Models (arXiv:2312.01987, published Dec 4, 2023)
Not All Patches are What You Need: Expediting Vision Transformers via Token Reorganizations (arXiv:2202.07800, published Feb 16, 2022)
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training (arXiv:2203.12602, published Mar 23, 2022)
VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking (arXiv:2303.16727, published Mar 29, 2023)
AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition (arXiv:2205.13535, published May 26, 2022)