Upload folder using huggingface_hub

Files changed (7)

README.md +18 -49
config.json +2 -2
generation_config.json +1 -1
openvino_model.bin +2 -2
openvino_model.xml +0 -0
tokenizer.json +6 -16
tokenizer_config.json +2 -2

README.md CHANGED

@@ -1,34 +1,33 @@
 ---
 license: apache-2.0
-language:
-- en
 ---
-# Mistral-7b-Instruct-v0.1-int4-ov
- * Model creator: [Mistral AI](https://huggingface.co/mistralai)
- * Original model: [Mistral-7b-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
 ## Description
-This is [Mistral-7b-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
 ## Quantization Parameters
 Weight compression was performed using `nncf.compress_weights` with the following parameters:
-* mode: **INT8_ASYM**
-For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html)
 ## Compatibility
 The provided OpenVINO™ IR model is compatible with:
-* OpenVINO version 2024.2.0 and higher
 * Optimum Intel 1.16.0 and higher
-## Running Model Inference with [Optimum Intel](https://huggingface.co/docs/optimum/intel/index)
 1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
@@ -42,56 +41,26 @@ pip install optimum[openvino]
 from transformers import AutoTokenizer
 from optimum.intel.openvino import OVModelForCausalLM
-model_id = "OpenVINO/mistral-7b-instrcut-v0.1-int4-ov"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = OVModelForCausalLM.from_pretrained(model_id)
 inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
-outputs = model.generate(inputs, max_new_tokens=20)
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
 ```
 For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).
-# Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)
-1. Install packages required for using OpenVINO GenAI.
-```
-pip install openvino-genai huggingface_hub
-```
-2. Download model from HuggingFace Hub
-```
-import huggingface_hub as hf_hub
-model_id = "OpenVINO/mistral-7b-instrcut-v0.1-int4-ov"
-model_path = "mistral-7b-instrcut-v0.1-int4-ov"
-hf_hub.snapshot_download(model_id, local_dir=model_path)
-```
-3. Run model inference:
-```
-import openvino_genai as ov_genai
-device = "CPU"
-pipe = ov_genai.LLMPipeline(model_path, device)
-print(pipe.generate("What is OpenVINO?"))
-```
-More GenAI usage examples can be found in OpenVINO GenAI library [docs](https://github.com/openvinotoolkit/openvino.genai/blob/master/src/README.md) and [samples](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#openvino-genai-samples)
 ## Limitations
-Check the original model card for [limitations](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1#limitations).
 ## Legal information
-The original model is distributed under [Apache 2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in [original model card](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
 ## Disclaimer

 ---
 license: apache-2.0
+license_link: https://choosealicense.com/licenses/apache-2.0/
 ---
+# mistral-7b-instruct-v0.1-int4-ov
+* Model creator: [Mistralai](https://huggingface.co/mistralai)
+ * Original model: [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1)
 ## Description
+This is [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT4 by [NNCF](https://github.com/openvinotoolkit/nncf).
 ## Quantization Parameters
 Weight compression was performed using `nncf.compress_weights` with the following parameters:
+* mode: **int4_asym**
+* ratio: **0.8**
+* group_size: **128**
+For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
 ## Compatibility
 The provided OpenVINO™ IR model is compatible with:
+* OpenVINO version 2024.1.0 and higher
 * Optimum Intel 1.16.0 and higher
+## Running Model Inference
 1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
 from transformers import AutoTokenizer
 from optimum.intel.openvino import OVModelForCausalLM
+model_id = "OpenVINO/mistral-7b-instruct-v0.1-int4-ov"
 tokenizer = AutoTokenizer.from_pretrained(model_id)
 model = OVModelForCausalLM.from_pretrained(model_id)
 inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
+outputs = model.generate(**inputs, max_length=200)
+text = tokenizer.batch_decode(outputs)[0]
+print(text)
 ```
 For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).
 ## Limitations
+Check the original model card for [limitations]().
 ## Legal information
+The original model is distributed under [apache-2.0](https://choosealicense.com/licenses/apache-2.0/) license. More details can be found in [original model card](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).
 ## Disclaimer

config.json CHANGED

@@ -19,7 +19,7 @@
   "rope_theta": 10000.0,
   "sliding_window": 4096,
   "tie_word_embeddings": false,
-  "transformers_version": "4.40.1",
   "use_cache": true,
   "vocab_size": 32000
-}

   "rope_theta": 10000.0,
   "sliding_window": 4096,
   "tie_word_embeddings": false,
+  "transformers_version": "4.41.2",
   "use_cache": true,
   "vocab_size": 32000
+}

generation_config.json CHANGED

@@ -2,5 +2,5 @@
   "_from_model_config": true,
   "bos_token_id": 1,
   "eos_token_id": 2,
-  "transformers_version": "4.40.1"
 }

   "_from_model_config": true,
   "bos_token_id": 1,
   "eos_token_id": 2,
+  "transformers_version": "4.41.2"
 }

openvino_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f7290c3cea473ee6b893af9a3327329f8d3c9c056a922ff6a57ca083e7bb12d7
-size 7280406284

 version https://git-lfs.github.com/spec/v1
+oid sha256:99eedd1fa219d94465dc7874ee240d0f515b23709f90bf92ab3bf589f64d59f6
+size 4617228256
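
Reviewer note: the two LFS pointer sizes give a quick sanity check on the INT8 to INT4 change described in the README diff. The following is a minimal sketch, assuming `ratio: 0.8` means roughly 80% of the compressible weights are stored at 4 bits and the remainder at 8 bits. It ignores per-group scales and zero points and any layers left uncompressed, so `expected_ratio` is a rough estimate, not a measurement. The helper `avg_bits` is hypothetical and exists only for this illustration.

```python
# Sizes in bytes, copied from the Git LFS pointers in this diff.
OLD_SIZE = 7_280_406_284  # previous openvino_model.bin (README described INT8_ASYM)
NEW_SIZE = 4_617_228_256  # new openvino_model.bin (README describes int4_asym)

GIB = 1024 ** 3


def avg_bits(ratio: float, low_bits: int = 4, high_bits: int = 8) -> float:
    """Average bits per weight when a `ratio` share of weights uses `low_bits`.

    Assumption: the remaining weights use `high_bits`; metadata is ignored.
    """
    return ratio * low_bits + (1.0 - ratio) * high_bits


observed_ratio = NEW_SIZE / OLD_SIZE
expected_ratio = avg_bits(0.8) / 8  # relative to an all-INT8 file

print(f"old: {OLD_SIZE / GIB:.2f} GiB, new: {NEW_SIZE / GIB:.2f} GiB")
print(f"observed new/old size: {observed_ratio:.3f}")
print(f"rough estimate:        {expected_ratio:.3f}")
```

An observed ratio somewhat above the estimate would be plausible, since group-wise quantization (`group_size: 128`) stores extra scale data that the estimate leaves out.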

openvino_model.xml CHANGED

The diff for this file is too large to render. See raw diff

tokenizer.json CHANGED

@@ -31,23 +31,13 @@
       "special": true
     }
   ],
-  "normalizer": {
-    "type": "Sequence",
-    "normalizers": [
-      {
-        "type": "Prepend",
-        "prepend": "▁"
-      },
-      {
-        "type": "Replace",
-        "pattern": {
-          "String": " "
-        },
-        "content": "▁"
-      }
-    ]
   },
-  "pre_tokenizer": null,
   "post_processor": {
     "type": "TemplateProcessing",
     "single": [

       "special": true
     }
   ],
+  "normalizer": null,
+  "pre_tokenizer": {
+    "type": "Metaspace",
+    "replacement": "▁",
+    "prepend_scheme": "first",
+    "split": false
   },
   "post_processor": {
     "type": "TemplateProcessing",
     "single": [
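
Reviewer note: the hunk above swaps a `Sequence` normalizer (`Prepend` then `Replace`) for a `Metaspace` pre-tokenizer with `prepend_scheme: "first"`. The sketch below mirrors both in plain Python to show that they should agree on ordinary text. It is a simplification, not the `tokenizers` implementation: the function names are hypothetical, special tokens and multi-piece inputs are not modelled, and the guard that skips prepending when the text already starts with "▁" is an assumption of this sketch.

```python
SPACE_MARK = "\u2581"  # the "▁" character used in tokenizer.json


def old_normalizer(text: str) -> str:
    """Mirror of the removed normalizer: Prepend "▁", then Replace " " with "▁"."""
    if not text:
        return text
    return (SPACE_MARK + text).replace(" ", SPACE_MARK)


def new_metaspace(text: str) -> str:
    """Mirror of the added Metaspace step for the first piece of text.

    Assumption: "▁" is prepended only if the text does not already start with it.
    Splitting is not modelled because the config sets "split": false.
    """
    replaced = text.replace(" ", SPACE_MARK)
    if replaced and not replaced.startswith(SPACE_MARK):
        replaced = SPACE_MARK + replaced
    return replaced


prompt = "What is OpenVINO?"
print(old_normalizer(prompt))
print(new_metaspace(prompt))
```

Under these assumptions the two differ only for input that starts with a space, where the old normalizer would produce a doubled "▁".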

tokenizer_config.json CHANGED

@@ -29,10 +29,10 @@
   },
   "additional_special_tokens": [],
   "bos_token": "<s>",
-  "chat_template": "{{ bos_token }}{% for message in messages %}{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}{% endif %}{% if message['role'] == 'user' %}{{ '[INST] ' + message['content'] + ' [/INST]' }}{% elif message['role'] == 'assistant' %}{{ message['content'] + eos_token + ' ' }}{% else %}{{ raise_exception('Only user and assistant roles are supported!') }}{% endif %}{% endfor %}",
   "clean_up_tokenization_spaces": false,
   "eos_token": "</s>",
-  "legacy": true,
   "model_max_length": 1000000000000000019884624838656,
   "pad_token": null,
   "sp_model_kwargs": {},

   },
   "additional_special_tokens": [],
   "bos_token": "<s>",
+  "chat_template": "{%- if messages[0]['role'] == 'system' %}\n    {%- set system_message = messages[0]['content'] %}\n    {%- set loop_messages = messages[1:] %}\n{%- else %}\n    {%- set loop_messages = messages %}\n{%- endif %}\n\n{{- bos_token }}\n{%- for message in loop_messages %}\n    {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) %}\n        {{- raise_exception('After the optional system message, conversation roles must alternate user/assistant/user/assistant/...') }}\n    {%- endif %}\n    {%- if message['role'] == 'user' %}\n        {%- if loop.first and system_message is defined %}\n            {{- ' [INST] ' + system_message + '\\n\\n' + message['content'] + ' [/INST]' }}\n        {%- else %}\n            {{- ' [INST] ' + message['content'] + ' [/INST]' }}\n        {%- endif %}\n    {%- elif message['role'] == 'assistant' %}\n        {{- ' ' + message['content'] + eos_token}}\n    {%- else %}\n        {{- raise_exception('Only user and assistant roles are supported, with the exception of an initial optional system message!') }}\n    {%- endif %}\n{%- endfor %}\n",
   "clean_up_tokenization_spaces": false,
   "eos_token": "</s>",
+  "legacy": false,
   "model_max_length": 1000000000000000019884624838656,
   "pad_token": null,
   "sp_model_kwargs": {},
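
Reviewer note: the new `chat_template` adds an optional leading system message and changes the spacing around `[INST]`. The sketch below is a plain-Python mirror of that template's logic, intended only to make the resulting prompt string easier to see. `render_chat` is a hypothetical helper, not part of any library, and it assumes the template is rendered so that whitespace around the `{%- ... %}` control tags is stripped.

```python
from typing import Dict, List


def render_chat(messages: List[Dict[str, str]],
                bos: str = "<s>", eos: str = "</s>") -> str:
    """Plain-Python mirror of the new chat_template (illustrative only)."""
    system = None
    if messages and messages[0]["role"] == "system":
        # An initial system message is folded into the first user turn.
        system = messages[0]["content"]
        messages = messages[1:]

    out = bos
    for index, message in enumerate(messages):
        role, content = message["role"], message["content"]
        # Same check order as the template: alternation first, then role.
        if (role == "user") != (index % 2 == 0):
            raise ValueError("After the optional system message, roles must "
                             "alternate user/assistant/user/assistant/...")
        if role == "user":
            if index == 0 and system is not None:
                content = system + "\n\n" + content
            out += " [INST] " + content + " [/INST]"
        elif role == "assistant":
            out += " " + content + eos
        else:
            raise ValueError("Only user and assistant roles are supported, "
                             "apart from an optional initial system message")
    return out


print(render_chat([
    {"role": "system", "content": "Be brief."},
    {"role": "user", "content": "What is OpenVINO?"},
]))
```

Compared with the removed template, each user turn now starts with a space before `[INST]`, and the assistant reply is preceded by a space instead of being followed by one.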