# ESRGAN Inference (C++)
A C++ command-line tool for running ESRGAN (Enhanced Super-Resolution Generative Adversarial Networks) inference with ONNX Runtime, supporting execution on both the CPU and AMD NPUs (Neural Processing Units) via the VitisAI Execution Provider.
## Overview
This tool performs image super-resolution using pre-trained ESRGAN models, taking a low-resolution image as input and producing an upscaled high-resolution output (4x per spatial dimension, e.g. 250x250 in, 1000x1000 out). It supports execution on both the CPU and AMD Ryzen AI NPUs for accelerated inference.
## Prerequisites
### Required Software
- **Windows 10/11** (x64 architecture)
- **Visual Studio 2019/2022** with C++17 support
- **CMake 3.5** or higher
- **AMD Ryzen AI SDK 1.4** (for NPU support)
- **OpenCV 4.12** (included in project at `./opencv/build`)
### Required Files
- **ONNX Model**: ESRGAN model file (`esrgan_fp32_qdq.onnx`)
- **Config JSON**: VitisAI configuration file (`esrgan_config.json`)
- **Runtime DLLs**: Ryzen AI runtime libraries (automatically copied from Ryzen AI SDK during build)
- **XCLBIN Files**: NPU binary files (automatically copied from Ryzen AI SDK during build)
- **OpenCV Dependencies**: OpenCV DLLs (included in `opencv/build/x64/vc16/bin/`)
### Hardware Requirements
- For NPU acceleration: an AMD Ryzen AI compatible processor (Phoenix, device ID 0x1503, or Strix, device ID 0x17F0)
- Minimum 8GB RAM recommended
- Sufficient disk space for models and output images
## Building the Project
The build process automatically copies all required dependencies to the output directory, making the executable completely self-contained for distribution.
### Prerequisites for Building
- Ryzen AI SDK 1.4 must be installed and the `RYZEN_AI_INSTALLATION_PATH` environment variable set
- Visual Studio 2019/2022 with C++17 support
- CMake 3.5 or higher
### Option 1: Using the Build Script (Recommended)
1. Set the Ryzen AI SDK environment variable:
```cmd
set RYZEN_AI_INSTALLATION_PATH=C:\Program Files\RyzenAI\1.4.0
```
2. Run the build script:
```cmd
compile.bat
```
### Option 2: Manual Build
1. Set the environment variable:
```cmd
set RYZEN_AI_INSTALLATION_PATH=C:\Program Files\RyzenAI\1.4.0
```
2. Generate build files with CMake:
```cmd
cmake -DCMAKE_CONFIGURATION_TYPES=Release -A x64 -T host=x64 -B build -S . -G "Visual Studio 17 2022"
```
3. Build the project:
```cmd
cmake --build .\build --config Release
```
## Files Automatically Copied During Build
The CMakeLists.txt configuration automatically copies the following files to `build/Release/`, creating a fully self-contained executable directory:
### Model and Configuration Files
- **esrgan_config.json** - VitisAI configuration file
- **esrgan_fp32_qdq.onnx** - ONNX model file
### NPU Binary Files
- **xclbins/** directory structure containing NPU binaries for different device types:
- Phoenix NPUs (device ID: 0x1503)
- Strix NPUs (device ID: 0x17F0)
- Copied from: `${RYZEN_AI_INSTALLATION_PATH}/voe-4.0-win_amd64/xclbins/`
### Runtime Dependencies
- **OpenCV DLLs** from `opencv/build/x64/vc16/bin/` (including `opencv_world412.dll`)
- **Ryzen AI DLLs** from local `RAI_dll/` directory:
- `onnxruntime.dll` - Core ONNX Runtime engine
- `onnxruntime_providers_vitisai.dll` - VitisAI execution provider
- `onnxruntime_vitisai_ep.dll` - VitisAI EP interface layer
- `onnxruntime_vitis_ai_custom_ops.dll` - Custom operations support
- `DirectML.dll` - DirectML runtime support
- `xclbin.dll` - XCLBIN file handling utilities
- `transaction.dll` - Transaction management
- `dyn_dispatch_core.dll` - Dynamic dispatch core
- **VitisAI Runtime** files from `${RYZEN_AI_INSTALLATION_PATH}/deployment/voe`
## Project Structure
### Source Files
```
src/
├── main.cpp # Main application logic
├── npu_util.cpp # NPU utility functions
├── npu_util.h # NPU utility headers
└── cxxopts.hpp # Command-line argument parsing (header-only library)
```
### Project Root
```
./
├── bird_input.png # Sample input image
├── CMakeLists.txt # Build configuration
├── compile.bat # Build script for Windows
├── esrgan_config.json # VitisAI EP configuration
├── esrgan_fp32_qdq.onnx # ONNX model file
├── esrgan_cache/ # VitisAI compilation cache (generated after first run)
├── opencv/ # Local OpenCV installation
│ └── build/ # Pre-built OpenCV binaries and headers
└── src/ # Source code directory
```
### After Build (`build/Release/`)
```
build/Release/
├── esrgan_inference.exe # Main executable
├── esrgan_fp32_qdq.onnx # ONNX model (copied from root)
├── esrgan_config.json # Configuration (copied from root)
├── opencv_world412.dll # OpenCV runtime (from opencv/build/x64/vc16/bin/)
├── onnxruntime.dll # ONNX Runtime core (from RAI_dll/)
├── onnxruntime_providers_vitisai.dll # VitisAI provider (from RAI_dll/)
├── onnxruntime_vitisai_ep.dll # VitisAI EP interface (from RAI_dll/)
├── onnxruntime_vitis_ai_custom_ops.dll # Custom ops support (from RAI_dll/)
├── DirectML.dll # DirectML runtime (from RAI_dll/)
├── xclbin.dll # XCLBIN utilities (from RAI_dll/)
├── transaction.dll # Transaction management (from RAI_dll/)
├── dyn_dispatch_core.dll # Dynamic dispatch (from RAI_dll/)
├── [additional VitisAI files] # Other runtime components from SDK deployment
└── xclbins/ # NPU binaries (from Ryzen AI SDK)
├── phoenix/ # Phoenix NPU binaries (device ID: 0x1503)
└── strix/ # Strix NPU binaries (device ID: 0x17F0)
```
**Note**: The build output directory is completely self-contained and portable. It can be copied to other compatible systems and run without additional installation requirements.
## Usage
### Command-Line Syntax
```cmd
esrgan_inference.exe [OPTIONS]
```
### Required Arguments
- `-m, --model <file>` : ONNX model filename (relative to executable directory)
- `-c, --config <file>` : JSON configuration filename (relative to executable directory)
### Optional Arguments
- `-i, --input_image <file>` : Input image file (default: `input_image.png`)
- `-o, --output_image <file>` : Output image file (default: `output_image.png`)
- `-k, --cache_key <string>` : Cache key for VitisAI EP (default: empty)
- `-d, --cache_dir <string>` : Cache directory for VitisAI EP (default: empty)
- `-x, --xclbin <file>` : XCLBIN filename for NPU (default: auto-selected)
- `-h, --help` : Show help message
### Example Usage
```cmd
esrgan_inference.exe ^
-m esrgan_fp32_qdq.onnx ^
-c esrgan_config.json ^
-i ..\..\bird_input.png ^
-o bird_output.png ^
-d ..\..\esrgan_cache ^
-k esrgan_cache
```
This example demonstrates:
- Using model and config files from the build directory (automatically copied)
- Relative paths for input images (from project root)
- Using the existing VitisAI cache directory for faster subsequent runs
**Note**: The `esrgan_cache/` directory is created automatically during the first NPU inference run. It contains compiled model artifacts that significantly speed up subsequent runs. You can point to this existing cache using the `-d` and `-k` parameters.
## Console Output Example
```
-------------------------------------------------------
Configuration Parameters:
-------------------------------------------------------
Executable Directory: C:\Users\user\Desktop\QuickTest\ESRGAN_Inference_cpp\build\Release
Model Path : C:\Users\user\Desktop\QuickTest\ESRGAN_Inference_cpp\build\Release\esrgan_fp32_qdq.onnx
Config JSON Path : C:\Users\user\Desktop\QuickTest\ESRGAN_Inference_cpp\build\Release\esrgan_config.json
Cache Key : esrgan_cache
Cache Directory : ..\..\
Input image : ..\..\bird_input.png
Output image : bird_output.png
-------------------------------------------------------
[INFO] Model input name: input
[INFO] Input data type: 1
[INFO] Input dims: [ 1 250 250 3 ]
[INFO] Model output name: output
[INFO] Output data type: 1
[INFO] Output dims: [ 1 1000 1000 3 ]
[INFO] Running inference...
-------------------------------------------------------
Performing compatibility check for VitisAI EP 1.4
-------------------------------------------------------
- NPU Device ID : 0x1503
- NPU Device Name : AMD IPU Device
- NPU Driver Version: 32.0.203.257
Environment compatible for VitisAI EP
[INFO] Writing upscaled image to: bird_output.png
[INFO] Done.
```
## License
MIT License - See source files for full license text.
## Dependencies and Attribution
This project includes and uses the following components:
- **AMD Ryzen AI SDK 1.4** - NPU acceleration support
- **ONNX Runtime** - Model inference engine (runtime DLLs included in `RAI_dll/`)
- **OpenCV 4.12** - Image processing (pre-built binaries included in `opencv/build/`)
- **cxxopts** - Command-line argument parsing (header-only, included in `src/`)
**Note**: This project is largely self-contained. The required ONNX Runtime and VitisAI provider DLLs are included in the `RAI_dll/` directory, and OpenCV binaries are included in the `opencv/` directory. The only external dependency is the Ryzen AI SDK installation for XCLBIN files and additional runtime components.
## Source Code References
- **Main Application**: [`src/main.cpp`](src/main.cpp)
- **NPU Utilities**: [`src/npu_util.cpp`](src/npu_util.cpp), [`src/npu_util.h`](src/npu_util.h)
- **Command-Line Parsing**: [`src/cxxopts.hpp`](src/cxxopts.hpp)
- **Build Configuration**: [`CMakeLists.txt`](CMakeLists.txt)
- **Build Script**: [`compile.bat`](compile.bat)