bweng committed
Commit aecbefd · verified · 1 Parent(s): 419eab0

Update README.md

Files changed (1):
  README.md +120 -5

README.md CHANGED
@@ -1,8 +1,123 @@
  ---
  license: mit
- base_model:
- - microsoft/Phi-4-mini-instruct
  tags:
- - openvino
- - phi4
- ---
  ---
+ language:
+ - multilingual
+ - ar
+ - zh
+ - cs
+ - da
+ - nl
+ - en
+ - fi
+ - fr
+ - de
+ - he
+ - hu
+ - it
+ - ja
+ - ko
+ - 'no'
+ - pl
+ - pt
+ - ru
+ - es
+ - sv
+ - th
+ - tr
+ - uk
  license: mit
+ license_link: https://huggingface.co/microsoft/Phi-4-mini-instruct/resolve/main/LICENSE
+ pipeline_tag: text-generation
  tags:
+ - nlp
+ - code
+ base_model: microsoft/Phi-4-mini-instruct
+ base_model_relation: quantized
+
+ ---
+ # Phi-4-mini-instruct-int4-ov
+ * Model creator: [Microsoft](https://huggingface.co/microsoft)
+ * Original model: [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct)
+
+ ## Description
+
+ This is the [Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2025/documentation/openvino-ir-format.html) (Intermediate Representation) format, with weights compressed to INT4 by [NNCF](https://github.com/openvinotoolkit/nncf).
+
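+ As a point of reference, a conversion like this one can typically be reproduced with the `optimum-cli` exporter. A sketch of the command, where the flag values mirror the quantization parameters listed below (exact flag support may vary across Optimum Intel versions):
+ ```
+ optimum-cli export openvino --model microsoft/Phi-4-mini-instruct \
+     --weight-format int4 --group-size 64 --ratio 1.0 \
+     --awq --scale-estimation --dataset wikitext2 \
+     Phi-4-mini-instruct-int4-ov
+ ```
+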
+ ## Quantization Parameters
+
+ Weight compression was performed using `nncf.compress_weights` with the following parameters:
+
+ * mode: **INT4_ASYM**
+ * ratio: **1.0**
+ * group_size: **64**
+ * awq: **True**
+ * scale_estimation: **True**
+ * dataset: [wikitext2](https://huggingface.co/datasets/mindchain/wikitext2)
+
+ For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2025/openvino-workflow/model-optimization-guide/weight-compression.html).
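+
+ A structural sketch of the corresponding NNCF call, assuming the original model has already been exported to an uncompressed OpenVINO IR. Here `ov_model.xml` is a hypothetical path and `calibration_inputs` is a placeholder for prepared calibration samples:
+ ```
+ import nncf
+ import openvino as ov
+
+ # Hypothetical path to the uncompressed (e.g. FP16) OpenVINO IR
+ model = ov.Core().read_model("ov_model.xml")
+
+ # Placeholder: an iterable of ready-to-run model input dicts
+ # (this card's export used tokenized wikitext2 samples)
+ dataset = nncf.Dataset(calibration_inputs)
+
+ compressed = nncf.compress_weights(
+     model,
+     mode=nncf.CompressWeightsMode.INT4_ASYM,
+     ratio=1.0,
+     group_size=64,
+     awq=True,
+     scale_estimation=True,
+     dataset=dataset,
+ )
+ ov.save_model(compressed, "openvino_model.xml")
+ ```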
+
+ ## Compatibility
+ The provided OpenVINO™ IR model is compatible with:
+ * OpenVINO version 2025.1.0 and higher
+ * Optimum Intel 1.22.0 and higher
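+
+ A quick way to verify that the installed packages meet these minimums, as a minimal sketch using the standard `importlib.metadata` module:
+ ```
+ from importlib.metadata import version
+
+ # Both should print versions at or above the minimums listed above
+ print("openvino:", version("openvino"))
+ print("optimum-intel:", version("optimum-intel"))
+ ```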
+
+ ## Running Model Inference with [Optimum Intel](https://huggingface.co/docs/optimum/intel/index)
+
+ 1. Install packages required for using the [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
+ ```
+ pip install optimum[openvino]
+ ```
+
+ 2. Run model inference:
+ ```
+ from transformers import AutoTokenizer
+ from optimum.intel.openvino import OVModelForCausalLM
+
+ model_id = "OpenVINO/Phi-4-mini-instruct-int4-ov"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ # Load the OpenVINO IR model and compile it for inference
+ model = OVModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
+
+ inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
+ outputs = model.generate(**inputs, max_length=200)
+ text = tokenizer.batch_decode(outputs)[0]
+ print(text)
+ ```
+ For more examples and possible optimizations, refer to the [Inference with Optimum Intel](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-optimum-intel.html) guide.
+
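+ Since Phi-4-mini-instruct is a chat-tuned model, prompts are normally wrapped in its chat template rather than passed as raw text. A minimal sketch, reusing the `tokenizer` and `model` objects from step 2:
+ ```
+ # Build a chat-formatted prompt from the model's chat template
+ messages = [{"role": "user", "content": "What is OpenVINO?"}]
+ input_ids = tokenizer.apply_chat_template(
+     messages, add_generation_prompt=True, return_tensors="pt"
+ )
+ outputs = model.generate(input_ids, max_new_tokens=200)
+ # Decode only the newly generated tokens
+ print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
+ ```
+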
+ ## Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)
+
+ 1. Install packages required for using OpenVINO GenAI:
+ ```
+ pip install -U openvino openvino-tokenizers openvino-genai
+ pip install huggingface_hub
+ ```
+
+ 2. Download the model from the Hugging Face Hub:
+
+ ```
+ import huggingface_hub as hf_hub
+
+ model_id = "OpenVINO/Phi-4-mini-instruct-int4-ov"
+ model_path = "Phi-4-mini-instruct-int4-ov"
+ # Download the OpenVINO IR files into a local directory
+ hf_hub.snapshot_download(model_id, local_dir=model_path)
+ ```
+
+ 3. Run model inference:
+ ```
+ import openvino_genai as ov_genai
+
+ # Use "GPU" or "NPU" here to target other devices
+ device = "CPU"
+ pipe = ov_genai.LLMPipeline(model_path, device)
+ print(pipe.generate("What is OpenVINO?", max_length=200))
+ ```
+ More GenAI usage examples can be found in the OpenVINO GenAI library [docs](https://docs.openvino.ai/2025/openvino-workflow-generative/inference-with-genai.html) and [samples](https://github.com/openvinotoolkit/openvino.genai?tab=readme-ov-file#openvino-genai-samples).
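+
+ For interactive use, OpenVINO GenAI can also stream tokens as they are produced by passing a streamer callback to `generate`. A minimal sketch, reusing `pipe` from step 3:
+ ```
+ # The callback receives decoded text chunks as they are generated;
+ # returning False tells the pipeline to keep generating
+ def streamer(chunk):
+     print(chunk, end="", flush=True)
+     return False
+
+ pipe.generate("What is OpenVINO?", streamer=streamer, max_new_tokens=200)
+ ```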
+
+ You can find more detailed usage examples in OpenVINO Notebooks:
+
+ - [LLM](https://openvinotoolkit.github.io/openvino_notebooks/?search=LLM)
+ - [RAG text generation](https://openvinotoolkit.github.io/openvino_notebooks/?search=RAG+system&tasks=Text+Generation)
+
+ ## Limitations
+ Check the [original model card](https://huggingface.co/microsoft/Phi-4-mini-instruct) for limitations.
+
+ ## Legal information
+ The original model is distributed under the [MIT](https://huggingface.co/microsoft/Phi-4-mini-instruct/resolve/main/LICENSE) license. More details can be found in the [original model card](https://huggingface.co/microsoft/Phi-4-mini-instruct).
+
+ ## Disclaimer
+ Intel is committed to respecting human rights and avoiding causing or contributing to adverse impacts on human rights. See [Intel’s Global Human Rights Principles](https://www.intel.com/content/dam/www/central-libraries/us/en/documents/policy-human-rights.pdf). Intel’s products and software are intended only to be used in applications that do not cause or contribute to adverse impacts on human rights.