Update README.md
        README.md
    CHANGED
    
    | @@ -8,64 +8,65 @@ license: other | |
| 8 | 
             
            # Model Overview
         | 
| 9 | 
             
            This is a multilingual text classification model that can enable data annotation, creation of domain-specific blends and the addition of metadata tags. The model classifies documents into one of 26 domain classes:
         | 
| 10 |  | 
| 11 | 
            -
            'Adult', 'Arts_and_Entertainment', 'Autos_and_Vehicles', 'Beauty_and_Fitness', 'Books_and_Literature', 'Business_and_Industrial', 'Computers_and_Electronics', 'Finance', 'Food_and_Drink', 'Games', 'Health', 'Hobbies_and_Leisure', 'Home_and_Garden', 'Internet_and_Telecom', 'Jobs_and_Education', 'Law_and_Government', 'News', 'Online_Communities', 'People_and_Society', 'Pets_and_Animals', 'Real_Estate', 'Science', 'Sensitive_Subjects', 'Shopping', 'Sports', 'Travel_and_Transportation'
         | 
| 12 | 
            -
             | 
| 13 | 
            -
            It supports 52 languages (English and 51 other languages) : 'ar', 'az', 'bg', 'bn', 'ca', 'cs', 'da', 'de', 'el', 'es', 'et', 'fa', 'fi', 'fr', 'gl', 'he', 'hi', 'hr', 'hu', 'hy', 'id', 'is', 'it', 'ka', 'kk', 'kn', 'ko', 'lt', 'lv', 'mk', 'ml', 'mr', 'ne', 'nl', 'no', 'pl', 'pt', 'ro', 'ru', 'sk', 'sl', 'sq', 'sr', 'sv', 'ta', 'tr', 'uk', 'ur', 'vi', 'ja', 'zh'
         | 
| 14 | 
             
            ```
         | 
| 15 | 
            -
             | 
| 16 | 
            -
            ar	Arabic
         | 
| 17 | 
            -
            az	Azerbaijani
         | 
| 18 | 
            -
            bg	Bulgarian
         | 
| 19 | 
            -
            bn	Bengali
         | 
| 20 | 
            -
            ca	Catalan
         | 
| 21 | 
            -
            cs	Czech
         | 
| 22 | 
            -
            da	Danish
         | 
| 23 | 
            -
            de	German
         | 
| 24 | 
            -
            el	Greek
         | 
| 25 | 
            -
            es	Spanish
         | 
| 26 | 
            -
            et	Estonian
         | 
| 27 | 
            -
            fa	Persian
         | 
| 28 | 
            -
            fi	Finnish
         | 
| 29 | 
            -
            fr	French
         | 
| 30 | 
            -
            gl	Galician
         | 
| 31 | 
            -
            he	Hebrew
         | 
| 32 | 
            -
            hi	Hindi
         | 
| 33 | 
            -
            hr	Croatian
         | 
| 34 | 
            -
            hu	Hungarian
         | 
| 35 | 
            -
            hy	Armenian
         | 
| 36 | 
            -
            id	Indonesian
         | 
| 37 | 
            -
            is	Icelandic
         | 
| 38 | 
            -
            it	Italian
         | 
| 39 | 
            -
            ka	Georgian
         | 
| 40 | 
            -
            kk	Kazakh
         | 
| 41 | 
            -
            kn	Kannada
         | 
| 42 | 
            -
            ko	Korean
         | 
| 43 | 
            -
            lt	Lithuanian
         | 
| 44 | 
            -
            lv	Latvian
         | 
| 45 | 
            -
            mk	Macedonian
         | 
| 46 | 
            -
            ml	Malayalam
         | 
| 47 | 
            -
            mr	Marathi
         | 
| 48 | 
            -
            ne	Nepali
         | 
| 49 | 
            -
            nl	Dutch
         | 
| 50 | 
            -
            no	Norwegian
         | 
| 51 | 
            -
            pl	Polish
         | 
| 52 | 
            -
            pt	Portuguese
         | 
| 53 | 
            -
            ro	Romanian
         | 
| 54 | 
            -
            ru	Russian
         | 
| 55 | 
            -
            sk	Slovak
         | 
| 56 | 
            -
            sl	Slovenian
         | 
| 57 | 
            -
            sq	Albanian
         | 
| 58 | 
            -
            sr	Serbian
         | 
| 59 | 
            -
            sv	Swedish
         | 
| 60 | 
            -
            ta	Tamil
         | 
| 61 | 
            -
            tr	Turkish
         | 
| 62 | 
            -
            uk	Ukrainian
         | 
| 63 | 
            -
            ur	Urdu
         | 
| 64 | 
            -
            vi	Vietnamese
         | 
| 65 | 
            -
            ja	Japanese
         | 
| 66 | 
            -
            zh	Chinese
         | 
| 67 | 
             
            ```
         | 
| 68 |  | 
| 69 | 
             
            # License
         | 
| 70 | 
             
            This model is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
         | 
| 71 |  | 
| @@ -126,6 +127,9 @@ Arts_and_Entertainment | |
| 126 | 
             
            ## Evaluation
         | 
| 127 | 
             
            - Metric: PR-AUC
         | 
| 128 |  | 
| 129 | 
             
            # Inference
         | 
| 130 | 
             
            - Engine: PyTorch
         | 
| 131 | 
             
            - Test Hardware: V100
         | 
|  | |
| 8 | 
             
            # Model Overview
         | 
| 9 | 
             
            This is a multilingual text classification model that can enable data annotation, creation of domain-specific blends and the addition of metadata tags. The model classifies documents into one of 26 domain classes:
         | 
| 10 |  | 
| 11 | 
             
            ```
         | 
| 12 | 
            +
            'Adult', 'Arts_and_Entertainment', 'Autos_and_Vehicles', 'Beauty_and_Fitness', 'Books_and_Literature', 'Business_and_Industrial', 'Computers_and_Electronics', 'Finance', 'Food_and_Drink', 'Games', 'Health', 'Hobbies_and_Leisure', 'Home_and_Garden', 'Internet_and_Telecom', 'Jobs_and_Education', 'Law_and_Government', 'News', 'Online_Communities', 'People_and_Society', 'Pets_and_Animals', 'Real_Estate', 'Science', 'Sensitive_Subjects', 'Shopping', 'Sports', 'Travel_and_Transportation'
         | 
| 13 | 
             
            ```
         | 
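As a rough illustration of how the 26 class names might be used downstream, the sketch below picks the highest-scoring label from a vector of per-class scores. The helper name `top_domain` is hypothetical, and the label order shown here is an assumption taken from the list above; the model's actual index-to-label mapping should be read from its configuration rather than from this snippet.

```python
# The 26 domain labels, copied from the README list.
# ASSUMPTION: this order matches the model's output indices. Verify it
# against the model's own config before relying on it.
DOMAIN_LABELS = [
    "Adult", "Arts_and_Entertainment", "Autos_and_Vehicles",
    "Beauty_and_Fitness", "Books_and_Literature", "Business_and_Industrial",
    "Computers_and_Electronics", "Finance", "Food_and_Drink", "Games",
    "Health", "Hobbies_and_Leisure", "Home_and_Garden",
    "Internet_and_Telecom", "Jobs_and_Education", "Law_and_Government",
    "News", "Online_Communities", "People_and_Society", "Pets_and_Animals",
    "Real_Estate", "Science", "Sensitive_Subjects", "Shopping", "Sports",
    "Travel_and_Transportation",
]


def top_domain(scores):
    """Return (label, score) for the highest-scoring class.

    `scores` is any sequence of 26 numbers (logits or probabilities),
    one per class. `top_domain` is a hypothetical helper, not part of
    the model's API.
    """
    if len(scores) != len(DOMAIN_LABELS):
        raise ValueError(
            f"expected {len(DOMAIN_LABELS)} scores, got {len(scores)}"
        )
    best = max(range(len(scores)), key=scores.__getitem__)
    return DOMAIN_LABELS[best], scores[best]
```

Because the model assigns each document to exactly one class, taking the arg-max over the score vector is enough; no threshold is needed.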
| 14 |  | 
| 15 | 
            +
            It supports 52 languages (English and the 51 other languages listed below):
         | 
| 16 | 
            +
            | Code | Language Name  |
         | 
| 17 | 
            +
            |------|----------------|
         | 
| 18 | 
            +
            | ar   | Arabic         |
         | 
| 19 | 
            +
            | az   | Azerbaijani    |
         | 
| 20 | 
            +
            | bg   | Bulgarian      |
         | 
| 21 | 
            +
            | bn   | Bengali        |
         | 
| 22 | 
            +
            | ca   | Catalan        |
         | 
| 23 | 
            +
            | cs   | Czech          |
         | 
| 24 | 
            +
            | da   | Danish         |
         | 
| 25 | 
            +
            | de   | German         |
         | 
| 26 | 
            +
            | el   | Greek          |
         | 
| 27 | 
            +
            | es   | Spanish        |
         | 
| 28 | 
            +
            | et   | Estonian       |
         | 
| 29 | 
            +
            | fa   | Persian        |
         | 
| 30 | 
            +
            | fi   | Finnish        |
         | 
| 31 | 
            +
            | fr   | French         |
         | 
| 32 | 
            +
            | gl   | Galician       |
         | 
| 33 | 
            +
            | he   | Hebrew         |
         | 
| 34 | 
            +
            | hi   | Hindi          |
         | 
| 35 | 
            +
            | hr   | Croatian       |
         | 
| 36 | 
            +
            | hu   | Hungarian      |
         | 
| 37 | 
            +
            | hy   | Armenian       |
         | 
| 38 | 
            +
            | id   | Indonesian     |
         | 
| 39 | 
            +
            | is   | Icelandic      |
         | 
| 40 | 
            +
            | it   | Italian        |
         | 
| 41 | 
            +
            | ka   | Georgian       |
         | 
| 42 | 
            +
            | kk   | Kazakh         |
         | 
| 43 | 
            +
            | kn   | Kannada        |
         | 
| 44 | 
            +
            | ko   | Korean         |
         | 
| 45 | 
            +
            | lt   | Lithuanian     |
         | 
| 46 | 
            +
            | lv   | Latvian        |
         | 
| 47 | 
            +
            | mk   | Macedonian     |
         | 
| 48 | 
            +
            | ml   | Malayalam      |
         | 
| 49 | 
            +
            | mr   | Marathi        |
         | 
| 50 | 
            +
            | ne   | Nepali         |
         | 
| 51 | 
            +
            | nl   | Dutch          |
         | 
| 52 | 
            +
            | no   | Norwegian      |
         | 
| 53 | 
            +
            | pl   | Polish         |
         | 
| 54 | 
            +
            | pt   | Portuguese     |
         | 
| 55 | 
            +
            | ro   | Romanian       |
         | 
| 56 | 
            +
            | ru   | Russian        |
         | 
| 57 | 
            +
            | sk   | Slovak         |
         | 
| 58 | 
            +
            | sl   | Slovenian      |
         | 
| 59 | 
            +
            | sq   | Albanian       |
         | 
| 60 | 
            +
            | sr   | Serbian        |
         | 
| 61 | 
            +
            | sv   | Swedish        |
         | 
| 62 | 
            +
            | ta   | Tamil          |
         | 
| 63 | 
            +
            | tr   | Turkish        |
         | 
| 64 | 
            +
            | uk   | Ukrainian      |
         | 
| 65 | 
            +
            | ur   | Urdu           |
         | 
| 66 | 
            +
            | vi   | Vietnamese     |
         | 
| 67 | 
            +
            | ja   | Japanese       |
         | 
| 68 | 
            +
            | zh   | Chinese        |
         | 
| 69 | 
            +
             | 
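The language table above can be turned into a simple lookup for filtering inputs before classification. This is a minimal sketch: the helper name `is_supported` is hypothetical, and the `en` entry for English is an assumption, since the README names English but does not list a code for it.

```python
# The 51 non-English languages from the README table.
OTHER_LANGUAGES = {
    "ar": "Arabic", "az": "Azerbaijani", "bg": "Bulgarian", "bn": "Bengali",
    "ca": "Catalan", "cs": "Czech", "da": "Danish", "de": "German",
    "el": "Greek", "es": "Spanish", "et": "Estonian", "fa": "Persian",
    "fi": "Finnish", "fr": "French", "gl": "Galician", "he": "Hebrew",
    "hi": "Hindi", "hr": "Croatian", "hu": "Hungarian", "hy": "Armenian",
    "id": "Indonesian", "is": "Icelandic", "it": "Italian", "ka": "Georgian",
    "kk": "Kazakh", "kn": "Kannada", "ko": "Korean", "lt": "Lithuanian",
    "lv": "Latvian", "mk": "Macedonian", "ml": "Malayalam", "mr": "Marathi",
    "ne": "Nepali", "nl": "Dutch", "no": "Norwegian", "pl": "Polish",
    "pt": "Portuguese", "ro": "Romanian", "ru": "Russian", "sk": "Slovak",
    "sl": "Slovenian", "sq": "Albanian", "sr": "Serbian", "sv": "Swedish",
    "ta": "Tamil", "tr": "Turkish", "uk": "Ukrainian", "ur": "Urdu",
    "vi": "Vietnamese", "ja": "Japanese", "zh": "Chinese",
}

# ASSUMPTION: English uses the code "en"; the README does not state it.
SUPPORTED_LANGUAGES = {"en": "English", **OTHER_LANGUAGES}


def is_supported(code):
    """Return True if `code` (case-insensitive) is a supported language.

    Hypothetical helper for pre-filtering documents by language code.
    """
    return code.strip().lower() in SUPPORTED_LANGUAGES
```

Counting the entries confirms the README's figure: 51 table rows plus English gives 52 languages.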
| 70 | 
             
            # License
         | 
| 71 | 
             
            This model is released under the [NVIDIA Open Model License Agreement](https://developer.download.nvidia.com/licenses/nvidia-open-model-license-agreement-june-2024.pdf).
         | 
| 72 |  | 
|  | |
| 127 | 
             
            ## Evaluation
         | 
| 128 | 
             
            - Metric: PR-AUC
         | 
| 129 |  | 
| 130 | 
            +
            PR-AUC by language:
         | 
| 131 | 
            +
            <img src="https://huggingface.co/nvidia/multilingual-domain-classifier/resolve/main/pr_auc_by_language.PNG" alt="pr_auc_by_language" style="width:750px;">
         | 
| 132 | 
            +
             | 
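The README reports PR-AUC but does not say how it was computed. One common estimate of the area under a precision-recall curve is average precision, sketched below for a single class in a one-vs-rest setting. The function name and the choice of average precision are assumptions for illustration, not the authors' evaluation code.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve for one class.

    labels: sequence of 0/1 values (1 = document belongs to the class).
    scores: sequence of floats, the model's score for that class.

    ASSUMPTION: this is one common PR-AUC estimate; the README does not
    specify the exact method. Tied scores are ranked in input order here,
    which can differ slightly from libraries that group ties.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("need at least one positive example")

    # Rank documents from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            # Precision at this recall step.
            precision_sum += hits / rank
    return precision_sum / total_positives
```

For a multi-class model, a per-language figure like the one plotted above could be obtained by computing this value for each class on that language's test documents and averaging; that aggregation is likewise an assumption.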
| 133 | 
             
            # Inference
         | 
| 134 | 
             
            - Engine: PyTorch
         | 
| 135 | 
             
            - Test Hardware: V100
         | 

