Update EXPORT.md
EXPORT.md
CHANGED
@@ -36,19 +36,22 @@ Exports model part E by running QwenVL_Export_E.py.
 
 Reduces ONNX model size by removing unnecessary elements for optimized deployment.
 
-
 **quantize**
 
 Quantizes all model parts (A, B, C, D, and E) to optimize size and performance.
 
 **quantize-%**
+
 Quantizes a specific model part (% can be A, B, C, D, or E) with targeted configurations.
 
 **clean-large-files**
+
 Deletes ONNX files larger than 2GB from the destination directory to retain models that will work for onnx environments.
 
 **fix-gpu-buffers**
+
 Applies fixes to GPU buffers in ONNX files for part E to ensure GPU memory compatibility.
 
 **all**
+
 Alias for all-in-one to run the full ONNX model preparation pipeline.
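
As a rough illustration of what the **clean-large-files** description amounts to, the following Python sketch deletes `.onnx` files above a size limit from a directory. The target's actual implementation is not shown in this diff, so the function name `clean_large_files`, the `*.onnx` glob, the `dry_run` flag, and the reading of "2GB" as 2 GiB are all assumptions made for the example.

```python
from pathlib import Path

# Assumption: "2GB" is read as 2 GiB; the real target may use a different cutoff.
TWO_GIB = 2 * 1024**3


def clean_large_files(dest_dir, limit_bytes=TWO_GIB, dry_run=False):
    """Delete *.onnx files in dest_dir larger than limit_bytes.

    Returns the list of paths that were (or, with dry_run, would be) removed.
    Hypothetical helper; not taken from the repository.
    """
    removed = []
    for path in sorted(Path(dest_dir).glob("*.onnx")):
        # Only regular files above the limit are candidates for deletion.
        if path.is_file() and path.stat().st_size > limit_bytes:
            if not dry_run:
                path.unlink()
            removed.append(path)
    return removed
```

Calling `clean_large_files("some_output_dir", dry_run=True)` first lists the files that would be deleted without touching them, which is a cautious way to check the limit before running the real cleanup.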