Update README.md
README.md CHANGED
````diff
@@ -14,17 +14,46 @@ Optimized models are published here in [ONNX](https://onnx.ai) format to run with ONNX Runtime
 
 To easily get started with the model, you can use our ONNX Runtime Generate() API. See instructions [here](https://github.com/microsoft/onnxruntime/blob/gh-pages/docs/genai/tutorials/deepseek-python.md)
 
+For CPU:
+
 ```bash
-# Download the model directly using the
-huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/*' --local-dir .
+# Download the model directly using the Hugging Face CLI
+huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/*' --local-dir .
+
+# Install the CPU package of ONNX Runtime GenAI
+pip install onnxruntime-genai
+
+# Please adjust the model directory (-m) accordingly
+curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
+python model-chat.py -m /path/to/cpu-int4-rtn-block-32-acc-level-4/ -e cpu --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
 ```
 
+For CUDA:
+
 ```bash
-
+# Download the model directly using the Hugging Face CLI
+huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/cuda/*' --local-dir .
+
+# Install the CUDA package of ONNX Runtime GenAI
+pip install onnxruntime-genai-cuda
+
+# Please adjust the model directory (-m) accordingly
+curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
+python model-chat.py -m /path/to/cuda-int4-rtn-block-32/ -e cuda --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
+```
+
+For DirectML:
+
+```bash
+# Download the model directly using the Hugging Face CLI
+huggingface-cli download onnxruntime/DeepSeek-R1-Distill-ONNX --include 'deepseek-r1-distill-qwen-1.5B/directml/*' --local-dir .
+
+# Install the DirectML package of ONNX Runtime GenAI
+pip install onnxruntime-genai-directml
 
-#
+# Please adjust the model directory (-m) accordingly
 curl -O https://raw.githubusercontent.com/microsoft/onnxruntime-genai/refs/heads/main/examples/python/model-chat.py
-python model-chat.py -m
+python model-chat.py -m /path/to/directml-int4-rtn-block-32/ -e dml --chat_template "<|begin▁of▁sentence|><|User|>{input}<|Assistant|>"
 ```
 
 ## ONNX Models
````
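The `huggingface-cli download` commands above can also be run programmatically. A minimal sketch with `huggingface_hub`'s `snapshot_download`, which the CLI wraps; the repo id and file pattern are the CPU ones from above, and `huggingface_hub` is assumed to be installed:

```python
from huggingface_hub import snapshot_download

# Fetch only the CPU build of the model, mirroring the CLI command above;
# swap the pattern for 'deepseek-r1-distill-qwen-1.5B/cuda/*' or
# 'deepseek-r1-distill-qwen-1.5B/directml/*' as needed.
snapshot_download(
    repo_id="onnxruntime/DeepSeek-R1-Distill-ONNX",
    allow_patterns="deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/*",
    local_dir=".",
)
```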
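`model-chat.py` is a thin wrapper around the Generate() API loop. A rough sketch of that loop with the `onnxruntime-genai` Python package, assuming the CPU model downloaded above; the model directory below is illustrative, and the exact API surface can vary slightly across onnxruntime-genai releases:

```python
import onnxruntime_genai as og

# Illustrative path into the CPU download above; adjust to your layout
model_dir = "deepseek-r1-distill-qwen-1.5B/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4"

model = og.Model(model_dir)
tokenizer = og.Tokenizer(model)
stream = tokenizer.create_stream()

# The same chat template the README passes via --chat_template
prompt = "<|begin▁of▁sentence|><|User|>What is 1+1?<|Assistant|>"

params = og.GeneratorParams(model)
params.set_search_options(max_length=2048)

generator = og.Generator(model, params)
generator.append_tokens(tokenizer.encode(prompt))

# Stream the answer token by token until end-of-sequence or max_length
while not generator.is_done():
    generator.generate_next_token()
    print(stream.decode(generator.get_next_tokens()[0]), end="", flush=True)
print()
```

The `-e` flag in `model-chat.py` (`cpu`, `cuda`, `dml`) selects the execution provider, so the model directory passed to `-m` must be the variant exported for that provider.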