Set pipeline tag to image-classification and add code link (#1)
- Set pipeline tag to image-classification and add code link (585e5761059c378e13dc562f32e1ca8571161e7c)
Co-authored-by: Niels Rogge <[email protected]>
    	
README.md CHANGED

````diff
@@ -1,13 +1,13 @@
 ---
+datasets:
+- ILSVRC/imagenet-21k
 license: other
 license_name: nvclv1
 license_link: LICENSE
-
-
-pipeline_tag: image-feature-extraction
+pipeline_tag: image-classification
+library_name: transformers
 ---
 
-
 [**MambaVision: A Hybrid Mamba-Transformer Vision Backbone**](https://arxiv.org/abs/2407.08083).
 
 ## Model Overview
@@ -37,7 +37,6 @@ MambaVision-B-21K is pretrained on ImageNet-21K dataset and finetuned on ImageNe
     <td>224x224</td>
 </tr>
 
-
 </table>
 
 In addition, the MambaVision models demonstrate a strong performance by achieving a new SOTA Pareto-front in
@@ -48,11 +47,11 @@ terms of Top-1 accuracy and throughput.
 class="center">
 </p>
 
-
 ## Model Usage
 
 It is highly recommended to install the requirements for MambaVision by running the following:
 
+Code: https://github.com/NVlabs/MambaVision
 
 ```Bash
 pip install mambavision
@@ -66,13 +65,11 @@ In the following example, we demonstrate how MambaVision can be used for image c
 
 Given the following image from [COCO dataset](https://cocodataset.org/#home)  val set as an input:
 
-
 <p align="center">
 <img src="https://cdn-uploads.huggingface.co/production/uploads/64414b62603214724ebd2636/4duSnqLf4lrNiAHczSmAN.jpeg" width=70% height=70% 
 class="center">
 </p>
 
-
 The following snippet can be used for image classification:
 
 ```Python
@@ -136,7 +133,7 @@ transform = create_transform(input_size=input_resolution,
                              is_training=False,
                              mean=model.config.mean,
                              std=model.config.std,
-                             crop_mode=model.config.
+                             crop_mode=model.config.crop_pct,
                              crop_pct=model.config.crop_pct)
 inputs = transform(image).unsqueeze(0).cuda()
 # model inference
@@ -147,7 +144,6 @@ print("Size of extracted features in stage 1:", features[0].size()) # torch.Size
 print("Size of extracted features in stage 4:", features[3].size()) # torch.Size([1, 640, 7, 7])
 ```
 
-
 ### License: 
 
 [NVIDIA Source Code License-NC](https://huggingface.co/nvidia/MambaVision-B-21K/blob/main/LICENSE)
````
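The `create_transform(...)` call touched by the last code hunk passes timm-style `crop_pct` and `crop_mode` arguments. As a rough illustration of what `crop_pct` controls in an eval transform (a minimal sketch assuming timm's usual convention of resizing the short side to `input_size / crop_pct` before a center crop; `eval_resize_size` is a hypothetical helper written for this note, not part of MambaVision or timm):

```python
import math


def eval_resize_size(input_size: int, crop_pct: float) -> int:
    """Sketch of the resize size an eval transform derives from crop_pct.

    The image's short side is resized to input_size / crop_pct, then a
    center crop of input_size x input_size is taken (convention sketch,
    not timm source code).
    """
    return int(math.floor(input_size / crop_pct))


# The common ImageNet eval setting: 224px crop from a 256px resize.
print(eval_resize_size(224, 0.875))  # 256
```

So a larger `crop_pct` keeps more of the resized image in the final crop, while the `mean`/`std` arguments in the same call only affect normalization, not geometry.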