---
license: cc
tags:
- image-to-image
datasets:
- peter-sushko/RealEdit
pipeline_tag: image-to-image
---
# REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations
Project page: https://peter-sushko.github.io/RealEdit/
Data: https://huggingface.co/datasets/peter-sushko/RealEdit
Paper: https://arxiv.org/pdf/2502.03629
<img src="https://peter-sushko.github.io/RealEdit/static/images/teaser.svg"/>
**There are two ways to run inference: via 🧨 Diffusers or via the original InstructPix2Pix pipeline.**
## Option 1: With 🧨 Diffusers
Install necessary libraries:
```bash
pip install torch==2.7.0 diffusers==0.33.1 transformers==4.51.3 accelerate==1.6.0 pillow==11.2.1 requests==2.32.3
```
Then run:
```python
import requests
import torch
from PIL import Image, ImageOps
from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler

# Load the RealEdit checkpoint with the InstructPix2Pix pipeline
model_id = "peter-sushko/RealEdit"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    safety_checker=None,
)
pipe.to("cuda")
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)

url = "https://raw.githubusercontent.com/AyanaBharadwaj/RealEdit/refs/heads/main/example_imgs/simba.jpg"

def download_image(url):
    # Fetch the image, honor EXIF orientation, and convert to RGB
    image = Image.open(requests.get(url, stream=True).raw)
    image = ImageOps.exif_transpose(image)
    return image.convert("RGB")

image = download_image(url)
prompt = "give him a crown"
result = pipe(prompt, image=image, num_inference_steps=50, image_guidance_scale=2).images[0]
result.save("output.png")
```
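For reproducible results you can also pass a seeded generator and tune the guidance scales. The snippet below is a minimal sketch using standard Diffusers pipeline arguments; the specific values (the seed and `guidance_scale`) are illustrative, not tuned recommendations:
```python
# Optional: seed the run and adjust guidance (illustrative values).
# guidance_scale weights the text instruction; image_guidance_scale
# weights fidelity to the input image.
generator = torch.Generator("cuda").manual_seed(42)
result = pipe(
    prompt,
    image=image,
    num_inference_steps=50,
    image_guidance_scale=2,
    guidance_scale=7.5,
    generator=generator,
).images[0]
result.save("output_seeded.png")
```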
## Option 2: With the original InstructPix2Pix pipeline
Clone the repository and set up the directory structure:
```bash
git clone https://github.com/timothybrooks/instruct-pix2pix.git
cd instruct-pix2pix
mkdir checkpoints
```
Download the fine-tuned checkpoint into the `checkpoints` directory:
```bash
cd checkpoints
wget https://huggingface.co/peter-sushko/RealEdit/resolve/main/realedit_model.ckpt
```
Return to the repo root and follow the [InstructPix2Pix installation guide](https://github.com/timothybrooks/instruct-pix2pix) to set up the environment.
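For reference, the upstream setup is conda-based; a typical sequence looks like the sketch below (the environment name comes from that repository's `environment.yaml`; see the linked guide for the authoritative steps):
```bash
# Run from the instruct-pix2pix repo root
conda env create -f environment.yaml
conda activate ip2p
```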
Edit a single image:
```bash
python edit_cli.py \
--input [YOUR_IMG_PATH] \
--output imgs/output.jpg \
--edit "YOUR EDIT INSTRUCTION" \
--ckpt checkpoints/realedit_model.ckpt
```
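For example, applying the same edit as in the Diffusers snippet to the sample image (the paths here are illustrative; point `--input` at wherever your image lives):
```bash
python edit_cli.py \
  --input example_imgs/simba.jpg \
  --output imgs/simba_crown.jpg \
  --edit "give him a crown" \
  --ckpt checkpoints/realedit_model.ckpt
```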
## Citation
If you find this checkpoint helpful, please cite:
```bibtex
@misc{sushko2025realeditredditeditslargescale,
title={REALEDIT: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations},
author={Peter Sushko and Ayana Bharadwaj and Zhi Yang Lim and Vasily Ilin and Ben Caffee and Dongping Chen and Mohammadreza Salehi and Cheng-Yu Hsieh and Ranjay Krishna},
year={2025},
eprint={2502.03629},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2502.03629},
}
``` |