vfontech committed
Commit 0d42fed · verified · 1 Parent(s): 20fd48f

Push model using huggingface_hub.

Files changed (2):
  1. README.md +7 -73
  2. config.json +1 -1
README.md CHANGED
@@ -1,77 +1,11 @@
  ---
  tags:
- - video-frame-interpolation
- - diffusion-model
- - animation
- - uncertainty-estimation
- license: mit
- language:
- - en
- pipeline_tag: image-to-image
  ---
 
- # 🤖 Multi‑Input ResShift Diffusion VFI
-
- <div align="left" style="display: flex; flex-direction: row; gap: 15px">
- <a href='https://arxiv.org/pdf/2504.05402'><img src='https://img.shields.io/badge/arXiv-2504.05402-b31b1b.svg'></a>
- <a href='https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI'><img src='https://img.shields.io/badge/Repo-Code-blue'></a>
- <a href='https://colab.research.google.com/drive/1MGYycbNMW6Mxu5MUqw_RW_xxiVeHK5Aa#scrollTo=EKaYCioiP3tQ'><img src='https://img.shields.io/badge/Colab-Demo-Green'></a>
- </div>
-
- ## ⚙️ Setup
-
- Start by cloning the repository from GitHub:
-
- ```bash
- git clone https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI.git
- ```
-
- Create a conda environment and install the requirements:
-
- ```bash
- conda create -n multi-input-resshift python=3.10
- conda activate multi-input-resshift
- pip install -r requirements.txt
- ```
-
- **Note**: Make sure your system is compatible with **CUDA 12.4**. If not, install [CuPy](https://docs.cupy.dev/en/stable/install.html) according to your current CUDA version.
-
- ## 🚀 Inference Example
-
- ```python
- import os
- from PIL import Image
- import numpy as np
- import matplotlib.pyplot as plt
-
- from torchvision.transforms import Compose, ToTensor, Resize, Normalize
- from utils.utils import denorm
- from model.hub import MultiInputResShiftHub
-
- model = MultiInputResShiftHub.from_pretrained("vfontech/Multiple-Input-Resshift-VFI")
- model.requires_grad_(False).cuda().eval()
-
- img0_path = "_data/example_images/frame1.png"
- img2_path = "_data/example_images/frame3.png"
-
- transforms = Compose([
-     Resize((256, 448)),
-     ToTensor(),
-     Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]),
- ])
-
- img0 = transforms(Image.open(img0_path).convert("RGB")).unsqueeze(0).cuda()
- img2 = transforms(Image.open(img2_path).convert("RGB")).unsqueeze(0).cuda()
- tau = 0.5  # interpolation time in [0, 1]; 0.5 targets the midpoint frame
-
- img1 = model.reverse_process([img0, img2], tau)  # reverse diffusion synthesizes the intermediate frame
-
- plt.figure(figsize=(10, 5))
- plt.subplot(1, 3, 1)
- plt.imshow(denorm(img0, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]).squeeze().permute(1, 2, 0).cpu().numpy())
- plt.subplot(1, 3, 2)
- plt.imshow(denorm(img1, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]).squeeze().permute(1, 2, 0).cpu().numpy())
- plt.subplot(1, 3, 3)
- plt.imshow(denorm(img2, mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5]).squeeze().permute(1, 2, 0).cpu().numpy())
- plt.show()
- ```
 
  ---
+ language: en
  tags:
+ - model_hub_mixin
+ - pytorch_model_hub_mixin
  ---

+ This model has been pushed to the Hub using the [PyTorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
+ - Code: https://github.com/VicFonch/Multi-Input-Resshift-Diffusion-VFI
+ - Paper: https://arxiv.org/pdf/2504.05402
+ - Docs: [More Information Needed]
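For context, the PyTorchModelHubMixin referenced in the regenerated card is what backs the `from_pretrained` call in the removed README example. Below is a minimal sketch of the round trip, assuming only the documented mixin API; the `TinyModel` module is hypothetical, not part of this repo:

```python
import torch
from huggingface_hub import PyTorchModelHubMixin

# Hypothetical toy module: any nn.Module that also inherits from
# PyTorchModelHubMixin gains save_pretrained / from_pretrained / push_to_hub.
class TinyModel(torch.nn.Module, PyTorchModelHubMixin):
    def __init__(self, hidden: int = 16):
        super().__init__()
        self.proj = torch.nn.Linear(hidden, hidden)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

model = TinyModel(hidden=16)
model.save_pretrained("tiny-model")        # writes config.json + weights locally
# model.push_to_hub("user/tiny-model")     # the kind of push this commit records
reloaded = TinyModel.from_pretrained("tiny-model")
```

Because the mixin serializes the model's `__init__` kwargs into `config.json`, a push like this one regenerates that file, which is what the `config.json` change below reflects.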
config.json CHANGED
@@ -1,7 +1,7 @@
  {
    "etas_end": 0.99,
    "flow_kwargs": {
-     "pretrained_path": "_pretrain_models/anime_interp_full.ckpt"
    },
    "flow_model": "raft",
    "kappa": 2.0,

  {
    "etas_end": 0.99,
    "flow_kwargs": {
+     "pretrained_path": null
    },
    "flow_model": "raft",
    "kappa": 2.0,