  - multiplane
  - multiplane-image
  - mpi
---

<div align="center">
<a href="#"><img src='https://img.shields.io/badge/-Paper-00629B?style=flat&logo=ieee&logoColor=white' alt='arXiv'></a>
<a href='https://realistic3d-miun.github.io/Research/RT_MPINet/index.html'><img src='https://img.shields.io/badge/Project_Page-Website-green?logo=googlechrome&logoColor=white' alt='Project Page'></a>
<a href='https://huggingface.co/spaces/3ZadeSSG/RT-MPINet'><img src='https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Demo_(RT_MPINet)-blue'></a>
</div>

# RT-MPINet
#### Real-Time View Synthesis with Multiplane Image Network using Multimodal Supervision (RT-MPINet)

We present a real-time multiplane image (MPI) network. Unlike existing MPI-based approaches, which often rely on a separate depth estimation network to guide MPI parameter estimation, our method predicts these parameters directly from a single RGB image. To guide the network, we use a multimodal training strategy with joint supervision from view synthesis and depth estimation losses. More details can be found in the paper.

**Please head to the [Project Page](https://realistic3d-miun.github.io/Research/RT_MPINet/index.html) for supplementary materials.**
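As background, the MPI representation predicted by the network is a stack of fronto-parallel RGBA layers that are warped and alpha-composited to form novel views. The following is an illustrative sketch of the back-to-front compositing step only (not the repository's renderer; the per-layer homography warp is omitted):

```python
import numpy as np

def composite_mpi(rgb, alpha):
    """Alpha-composite MPI layers back to front (the "over" operator).

    rgb:   (D, H, W, 3) per-layer colors, back layer first
    alpha: (D, H, W, 1) per-layer opacities in [0, 1]
    """
    out = np.zeros(rgb.shape[1:], dtype=np.float64)  # (H, W, 3) canvas
    for d in range(rgb.shape[0]):  # iterate from back layer to front
        out = rgb[d] * alpha[d] + out * (1.0 - alpha[d])
    return out

# Toy example: an opaque red back layer under a half-transparent green front layer
D, H, W = 2, 4, 4
rgb = np.zeros((D, H, W, 3))
alpha = np.zeros((D, H, W, 1))
rgb[0, ..., 0] = 1.0; alpha[0] = 1.0   # back: red, fully opaque
rgb[1, ..., 1] = 1.0; alpha[1] = 0.5   # front: green, 50% opacity
img = composite_mpi(rgb, alpha)        # every pixel -> [0.5, 0.5, 0.0]
```

Because the layers are fixed once predicted, rendering a new viewpoint only requires re-warping and re-compositing them, which is what makes real-time view synthesis possible.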

## Setup

1. Clone the GitHub repository:
```bash
git clone https://github.com/Realistic3D-MIUN/RT-MPINet
cd RT-MPINet
```
2. Install the dependencies:
```bash
pip install -r requirements.txt
```
3. Install PyTorch3D after the other dependencies have been installed:
```bash
pip install "pytorch3d @ git+https://github.com/facebookresearch/pytorch3d.git@89653419d0973396f3eff1a381ba09a07fffc2ed"
```

## Checkpoints (Best Checkpoints Will Be Updated Soon)
Pretrained model checkpoints should be placed in the `checkpoint/` directory. Example filenames:
- `checkpoint_RT_MPI_Small.pth`
- `checkpoint_RT_MPI_Medium.pth`
- `checkpoint_RT_MPI_Large.pth`

| Model | Size | Parameters | Checkpoint |
|-----------------|--------|-------------|----------------|
| Small | 26 MB | 6.6 million | [Download](https://huggingface.co/3ZadeSSG/RT-MPINet/resolve/main/checkpoint_RT_MPI_Small.pth) |
| Medium (Default)| 278 MB | 69 million | [Download](https://huggingface.co/3ZadeSSG/RT-MPINet/resolve/main/checkpoint_RT_MPI_Medium.pth) |
| Large | 1.2 GB | 288 million | [Download](https://huggingface.co/3ZadeSSG/RT-MPINet/resolve/main/checkpoint_RT_MPI_Large.pth) |

## Usage

### 1. Live Rendering Demo
You can load any image and run model inference each time the camera position changes; the frame rate is therefore limited by the inference speed of your GPU.
```bash
python renderLiveWithMouseControl.py \
    --input_image <path_to_image> \
    --model_type <small|medium|large> \
    --checkpoint_path <path_to_checkpoint> \
    --height <height> \
    --width <width>
```
Example:
```bash
python renderLiveWithMouseControl.py \
    --input_image ./samples/moon.jpg \
    --model_type medium \
    --checkpoint_path ./checkpoint/checkpoint_RT_MPI_Medium.pth \
    --height 256 \
    --width 256
```

### 2. Inference: Predict MPIs from an image and render afterwards
The predicted MPIs can be used for offline rendering, which is much faster because the model is not queried each time the camera moves. This requires two steps.

* First, predict the MPIs:
```bash
python predictMPIs.py \
    --input_image <path_to_image> \
    --model_type <small|medium|large> \
    --checkpoint_path <path_to_checkpoint> \
    --save_dir <output_dir> \
    --height <height> \
    --width <width>
```

* Second, load the MPIs and render views without invoking the model:
```bash
python renderPreProcessedWithMouseControl.py \
    --layer_path <output_dir> \
    --height <height> \
    --width <width>
```

Example:
```bash
python predictMPIs.py \
    --input_image ./samples/moon.jpg \
    --model_type medium \
    --checkpoint_path ./checkpoint/checkpoint_RT_MPI_Medium.pth \
    --save_dir ./processedLayers/ \
    --height 384 \
    --width 384
```

```bash
python renderPreProcessedWithMouseControl.py \
    --layer_path ./processedLayers/ \
    --height 384 \
    --width 384
```

### 3. Web Demo (Gradio)
You can run the Hugging Face demo app locally to use your own GPU for faster inference:
```bash
python app.py
```

## Supported Resolutions
We have tested our model with the following resolutions:
- 256x256
- 384x384
- 512x512
- 256x384
- 384x512

**Note:** If you use a non-square aspect ratio, you need to modify the torch transform accordingly.
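As an illustration of the kind of change the note above refers to (this is not the repository's actual preprocessing code), a resize for a non-square target such as 256x384 must specify both dimensions explicitly; a sketch using plain `torch`:

```python
import torch
import torch.nn.functional as F

def resize_for_model(image, height=256, width=384):
    """Resize a (3, H, W) image tensor to a non-square model resolution.

    Both height and width must be given explicitly; passing a single
    scalar size (as is common for square inputs) would preserve the
    aspect ratio and produce the wrong shape for the model.
    """
    batched = image.unsqueeze(0)  # F.interpolate expects (N, C, H, W)
    resized = F.interpolate(batched, size=(height, width),
                            mode="bilinear", align_corners=False)
    return resized.squeeze(0)

img = torch.rand(3, 512, 512)
out = resize_for_model(img, height=256, width=384)  # shape (3, 256, 384)
```

If the project uses `torchvision` transforms instead, the equivalent change is passing a `(height, width)` tuple rather than a single integer to the resize transform.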

## Acknowledgements
- We thank the authors of [AdaMPI](https://github.com/yxuhan/AdaMPI) for their implementation of the homography renderer, which is used in this codebase under the `./utils` directory.
- We thank the author of the [Deepview renderer](https://github.com/Findeton/deepview) template, which was used for our project page.

## Citation
If you use our work, please use the following citation:
```
@inproceedings{gond2025rtmpi,
  title={Real-Time View Synthesis with Multiplane Image Network using Multimodal Supervision},
  author={Gond, Manu and Shamshirgarha, Mohammadreza and Zerman, Emin and Knorr, Sebastian and Sj{\"o}str{\"o}m, M{\aa}rten},
  booktitle={2025 IEEE 27th International Workshop on Multimedia Signal Processing (MMSP)},
  pages={},
  year={2025},
  organization={IEEE}
}
```