Add how to run quantized models to README.md
README.md CHANGED
@@ -71,6 +71,14 @@ Find more details in the original model card: https://huggingface.co/LiquidAI/LF
 
 Example usage with [llama.cpp](https://github.com/ggml-org/llama.cpp):
 
+full precision (F16/F16):
+
 ```
 llama-mtmd-cli -hf LiquidAI/LFM2-VL-1.6B-GGUF
 ```
+
+fastest inference (Q4_0/Q8_0):
+
+```
+llama-mtmd-cli -hf LiquidAI/LFM2-VL-1.6B-GGUF:Q4_0
+```
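The hunk mentions both Q4_0 and Q8_0 under "fastest inference" but only shows a command for Q4_0. Assuming the repo also publishes a Q8_0 file, the same `-hf repo:TAG` pattern presumably applies (this Q8_0 invocation is an extrapolation, not shown in the commit):

```shell
# Sketch: fetch and run the Q8_0 quant, assuming the GGUF repo
# publishes it under the same tag convention as Q4_0.
llama-mtmd-cli -hf LiquidAI/LFM2-VL-1.6B-GGUF:Q8_0
```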