---
license: cc
tags:
- image-to-image
datasets:
- peter-sushko/RealEdit
pipeline_tag: image-to-image
---

# REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Project page: https://peter-sushko.github.io/RealEdit/  
Data: https://huggingface.co/datasets/peter-sushko/RealEdit  
Paper: https://arxiv.org/pdf/2502.03629

<img src="https://peter-sushko.github.io/RealEdit/static/images/teaser.svg"/>  


**There are two ways to run inference: via Diffusers or via the original InstructPix2Pix pipeline.**

## Option 1: With 🧨 Diffusers

Install the required libraries:

```bash
pip install torch==2.7.0 diffusers==0.33.1 transformers==4.51.3 accelerate==1.6.0 pillow==11.2.1 requests==2.32.3
```
Then run:

```python
import torch
import requests
# Import the submodules explicitly; a bare `import PIL` does not load them.
from PIL import Image, ImageOps
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

# Load the RealEdit checkpoint in half precision and move it to the GPU.
model_id = "peter-sushko/RealEdit"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    safety_checker=None
)
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

url = "https://raw.githubusercontent.com/AyanaBharadwaj/RealEdit/refs/heads/main/example_imgs/simba.jpg"

def download_image(url):
    # Fetch the image, fail early on HTTP errors, apply the EXIF orientation, and convert to RGB.
    response = requests.get(url, stream=True)
    response.raise_for_status()
    image = Image.open(response.raw)
    image = ImageOps.exif_transpose(image)
    return image.convert("RGB")

image = download_image(url)

# Run the edit and save the first (and only) output image.
prompt = "give him a crown"
result = pipe(prompt, image=image, num_inference_steps=50, image_guidance_scale=2).images[0]
result.save("output.png")
```  
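
The call above fixes `image_guidance_scale=2`. If an edit comes out too weak or drifts too far from the input, it can help to compare a few settings side by side. The sketch below builds a small grid of keyword arguments for the pipeline call. `guidance_scale` and `image_guidance_scale` are existing arguments of the Diffusers InstructPix2Pix pipeline; the helper name `guidance_grid` and the values in the grid are illustrative assumptions, not settings recommended by the authors.

```python
from itertools import product

def guidance_grid(text_scales, image_scales, num_inference_steps=50):
    """Return one kwargs dict per (guidance_scale, image_guidance_scale) pair.

    `guidance_grid` is a hypothetical helper for illustration only.
    """
    return [
        {
            "guidance_scale": text_scale,
            "image_guidance_scale": image_scale,
            "num_inference_steps": num_inference_steps,
        }
        for text_scale, image_scale in product(text_scales, image_scales)
    ]

# Illustrative values only (assumption); the example above uses image_guidance_scale=2.
settings = guidance_grid(text_scales=[5.0, 7.5], image_scales=[1.5, 2.0])

for kwargs in settings:
    print(kwargs)
    # With `pipe`, `prompt`, and `image` defined as in the example above:
    # result = pipe(prompt, image=image, **kwargs).images[0]
    # result.save(f"output_g{kwargs['guidance_scale']}_i{kwargs['image_guidance_scale']}.png")
```

In general, a higher `image_guidance_scale` keeps the result closer to the input image, while a higher `guidance_scale` pushes it to follow the text instruction more strongly.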


## Option 2: With the original InstructPix2Pix pipeline

Clone the repository and set up the directory structure:

```bash
git clone https://github.com/timothybrooks/instruct-pix2pix.git
cd instruct-pix2pix
mkdir checkpoints
```

Download the fine-tuned checkpoint into the `checkpoints` directory:

```bash
cd checkpoints
# wget https://huggingface.co/peter-sushko/RealEdit/resolve/main/realedit_model.ckpt
```

Return to the repo root and follow the [InstructPix2Pix installation guide](https://github.com/timothybrooks/instruct-pix2pix) to set up the environment.

Edit a single image:

```bash
python edit_cli.py \
  --input [YOUR_IMG_PATH] \
  --output imgs/output.jpg \
  --edit "YOUR EDIT INSTRUCTION" \
  --ckpt checkpoints/realedit_model.ckpt
```
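
To apply the same instruction to several images, one option is to generate one `edit_cli.py` command per file. The following is a minimal sketch that uses only the four flags shown above. The helper names `build_edit_command` and `build_batch`, the `*.jpg` pattern, and the path `imgs/example.jpg` are hypothetical and chosen for illustration.

```python
import shlex
from pathlib import Path

def build_edit_command(input_path, output_path, instruction,
                       ckpt="checkpoints/realedit_model.ckpt"):
    """Build the argument list for one edit_cli.py call (hypothetical helper)."""
    return [
        "python", "edit_cli.py",
        "--input", str(input_path),
        "--output", str(output_path),
        "--edit", instruction,
        "--ckpt", ckpt,
    ]

def build_batch(input_dir, output_dir, instruction, pattern="*.jpg"):
    """Build one command per matching image, sorted for a stable order."""
    out = Path(output_dir)
    return [
        build_edit_command(path, out / path.name, instruction)
        for path in sorted(Path(input_dir).glob(pattern))
    ]

# `imgs/example.jpg` is a placeholder path (assumption), not a file shipped with the repo.
cmd = build_edit_command("imgs/example.jpg", "imgs/output.jpg", "give him a crown")
print(shlex.join(cmd))

# To execute the command from the instruct-pix2pix repo root:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when the edit instruction contains spaces or quotes.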

## Citation

If you find this checkpoint helpful, please cite:

```
@misc{sushko2025realeditredditeditslargescale,
      title={REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations}, 
      author={Peter Sushko and Ayana Bharadwaj and Zhi Yang Lim and Vasily Ilin and Ben Caffee and Dongping Chen and Mohammadreza Salehi and Cheng-Yu Hsieh and Ranjay Krishna},
      year={2025},
      eprint={2502.03629},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2502.03629}, 
}
```