---
license: apache-2.0
language:
- en
base_model:
- Qwen/Qwen-Image-Edit
pipeline_tag: image-to-image
tags:
- lora
- qwen
- qwen-image
- qwen-image-edit
- image-editing
- inscene
- spatial-understanding
- scene-coherence
- computer-vision
- InScene
library_name: diffusers
---

# Qwen Image Edit Inscene LoRA

An open-source LoRA (Low-Rank Adaptation) for Qwen-Image-Edit, built by [FlyMy.AI](https://flymy.ai), that specializes in in-scene image editing.

## 🌟 About FlyMy.AI

FlyMy.AI provides agentic infrastructure for GenAI: a B2B platform for building and running GenAI media agents.

**🔗 Useful Links:**
- 🌐 [Official Website](https://flymy.ai)
- 📚 [Documentation](https://docs.flymy.ai/intro)
- 💬 [Discord Community](https://discord.com/invite/t6hPBpSebw)
- 🤗 [LoRA Training Repository](https://github.com/FlyMyAI/flymyai-lora-trainer)
- 🐦 [X (Twitter)](https://x.com/flymyai)
- 💼 [LinkedIn](https://linkedin.com/company/flymyai)
- 📺 [YouTube](https://youtube.com/@flymyai)
- 📸 [Instagram](https://www.instagram.com/flymy_ai)

---

## 🚀 Features

- LoRA-based fine-tuning for efficient in-scene image editing
- Specialized for Qwen-Image-Edit model
- Enhanced control over scene composition and object positioning
- Optimized for maintaining scene coherence during edits
- Compatible with Hugging Face `diffusers`
- Control-based image editing with improved spatial understanding

---

## 📦 Installation

1. Install required packages (`peft` is included because `diffusers` uses it to load LoRA weights):
   ```bash
   pip install torch torchvision diffusers transformers accelerate peft
   ```

2. Install the latest `diffusers` from GitHub:
   ```bash
   pip install git+https://github.com/huggingface/diffusers
   ```
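
To confirm the environment before downloading any weights, a small check along these lines can help. It is a minimal sketch using only the standard library; the package list mirrors the install step, and `peft` is listed on the assumption that your `diffusers` version uses it as the LoRA backend. The helper name `installed_versions` is made up for this example.

```python
from importlib import metadata

# Distributions to look for; `peft` is an assumption (LoRA backend for diffusers).
REQUIRED = ["torch", "torchvision", "diffusers", "transformers", "accelerate", "peft"]


def installed_versions(packages=REQUIRED):
    """Map each distribution name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:<14}{version or 'NOT INSTALLED'}")
```

If `diffusers` is present but `QwenImageEditPipeline` fails to import, the installed release probably predates the pipeline, which is what the GitHub install in step 2 addresses.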

---

## 🧪 Usage

### 🔧 Qwen-Image-Edit Initialization

```python
from diffusers import QwenImageEditPipeline
import torch
from PIL import Image

# Load the pipeline
# Loading directly in bfloat16 avoids first materializing the weights in float32
pipeline = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit", torch_dtype=torch.bfloat16
)
pipeline.to("cuda")
```

### 🔌 Load LoRA Weights

```python
# Load trained LoRA weights for in-scene editing
pipeline.load_lora_weights(
    "flymy-ai/qwen-image-edit-inscene-lora",
    weight_name="flymy_qwen_image_edit_inscene_lora.safetensors",
)
```
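
If the LoRA effect is too strong or too weak for a given image, its strength can usually be adjusted after loading. The sketch below is not part of the original instructions: the helper `load_inscene_lora`, the adapter name `"inscene"`, and the `scale` parameter are all hypothetical, and it assumes your `diffusers` version supports `adapter_name` in `load_lora_weights` and `set_adapters` for this pipeline.

```python
def load_inscene_lora(pipeline, scale=1.0, adapter_name="inscene"):
    """Load the Inscene LoRA under a named adapter and set its strength.

    `adapter_name` and `scale` are illustrative choices, not values from the model card.
    """
    pipeline.load_lora_weights(
        "flymy-ai/qwen-image-edit-inscene-lora",
        weight_name="flymy_qwen_image_edit_inscene_lora.safetensors",
        adapter_name=adapter_name,
    )
    # 1.0 applies the LoRA at full strength; smaller values blend toward the base model.
    pipeline.set_adapters([adapter_name], adapter_weights=[scale])
    return pipeline
```

Call it in place of the plain `load_lora_weights` call, for example `load_inscene_lora(pipeline, scale=0.8)`.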

### 🎨 Edit Image with Qwen-Image-Edit Inscene LoRA

```python
# Load input image
image = Image.open("./assets/qie2_input.jpg").convert("RGB")

# Define in-scene editing prompt
prompt = "Make a shot in the same scene of the left hand securing the edge of the cutting board while the right hand tilts it, causing the chopped tomatoes to slide off into the pan, camera angle shifts slightly to the left to center more on the pan."

# Generate edited image with enhanced scene understanding
inputs = {
    "image": image,
    "prompt": prompt,
    "generator": torch.manual_seed(0),
    "true_cfg_scale": 4.0,
    "negative_prompt": " ",
    "num_inference_steps": 50,
}

with torch.inference_mode():
    output = pipeline(**inputs)
    output_image = output.images[0]
    output_image.save("edited_image.png")
```

### 🖼️ Sample Output - Qwen-Image-Edit Inscene

**Input Image:**

![Input Image](./assets/qie2_input.jpg)

**Prompt:** 
"Make a shot in the same scene of the left hand securing the edge of the cutting board while the right hand tilts it, causing the chopped tomatoes to slide off into the pan, camera angle shifts slightly to the left to center more on the pan."

**Output without LoRA:**

![Output without LoRA](./assets/qie2_orig.jpg)

**Output with Inscene LoRA:**

![Output with LoRA](./assets/qie2_lora.jpg)
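
A comparison like the one above can be reproduced by running the same request twice, once with the LoRA disabled and once with it enabled. This is a rough sketch, not the script used to produce the sample images: `compare_lora` and `make_inputs` are hypothetical names, and it assumes the pipeline exposes `disable_lora()` and `enable_lora()`, as `diffusers` pipelines with LoRA support generally do.

```python
def compare_lora(pipeline, make_inputs):
    """Run the same request with the LoRA disabled, then enabled.

    `make_inputs` must build a fresh inputs dict on every call so that each
    run starts from a newly seeded generator and therefore the same noise.
    """
    pipeline.disable_lora()
    try:
        base_image = pipeline(**make_inputs()).images[0]
    finally:
        # Re-enable the LoRA even if the base run fails.
        pipeline.enable_lora()
    lora_image = pipeline(**make_inputs()).images[0]
    return base_image, lora_image


# Example wiring (needs `pipeline`, `image` and `prompt` from the Usage section):
#
#   import torch
#   def make_inputs():
#       return {"image": image, "prompt": prompt, "negative_prompt": " ",
#               "generator": torch.manual_seed(0), "true_cfg_scale": 4.0,
#               "num_inference_steps": 50}
#   base_image, lora_image = compare_lora(pipeline, make_inputs)
```

Rebuilding the inputs for each run matters because a generator that has already been consumed would give the second run different starting noise, which makes the two outputs harder to compare.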

---

### Workflow Features

- ✅ Pre-configured for Qwen-Image-Edit + Inscene LoRA inference
- ✅ Optimized settings for in-scene editing quality
- ✅ Enhanced spatial understanding and scene coherence
- ✅ Easy prompt and parameter adjustment
- ✅ Compatible with various input image types

---

## 🎯 What is Inscene LoRA?

This LoRA model is specifically trained to enhance Qwen-Image-Edit's ability to perform **in-scene image editing**. It focuses on:

- **Scene Coherence**: Maintaining logical spatial relationships within the scene
- **Object Positioning**: Better understanding of object placement and movement
- **Camera Perspective**: Improved handling of viewpoint changes and camera movements
- **Action Sequences**: Enhanced ability to depict sequential actions within the same scene
- **Contextual Editing**: Preserving scene context while making targeted modifications
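
The sample prompt in this card opens with "Make a shot in the same scene of", then describes the action and ends with a camera note. The helper below simply reproduces that shape. It is an illustrative sketch: `inscene_prompt` is a made-up name, and whether the opening phrase acts as a required trigger is an assumption, since the card does not say.

```python
def inscene_prompt(action, camera=None):
    """Compose a prompt shaped like the sample: same-scene action plus an optional camera note."""
    prompt = f"Make a shot in the same scene of {action.strip().rstrip('.,')}"
    if camera:
        prompt += f", {camera.strip().rstrip('.')}"
    return prompt + "."


print(inscene_prompt(
    "the right hand tilting the cutting board so the chopped tomatoes slide into the pan",
    camera="camera angle shifts slightly to the left",
))
```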

---

## 🔧 Training Information

This LoRA model was trained using the [FlyMy.AI LoRA Trainer](https://github.com/FlyMyAI/flymyai-lora-trainer) with:

- **Base Model**: Qwen/Qwen-Image-Edit
- **Training Focus**: In-scene image editing and spatial understanding
- **Dataset**: Curated collection of scene-based editing examples (InScene dataset)
- **Optimization**: Low-rank adaptation for efficient fine-tuning
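
To see why low-rank adaptation is efficient: instead of updating a full `d_out x d_in` weight matrix, LoRA trains two small matrices of shapes `d_out x r` and `r x d_in`. The sketch below counts parameters for a single layer. The layer size and rank are illustrative assumptions, not the actual values used for this model.

```python
def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora, ratio) trainable-parameter counts for one linear layer."""
    full = d_out * d_in            # parameters in the full weight matrix
    lora = rank * (d_in + d_out)   # parameters in the two low-rank factors
    return full, lora, lora / full


# Hypothetical 3072 x 3072 projection with rank 16 (not this model's real config)
full, lora, ratio = lora_param_counts(3072, 3072, 16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {ratio:.2%}")
```

For this hypothetical layer the LoRA factors hold roughly 1% of the parameters of the full matrix, which is why the released weights are small compared to the base model.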

---

## 📊 Model Specifications

- **Model Type**: LoRA (Low-Rank Adaptation)
- **Base Model**: Qwen/Qwen-Image-Edit
- **File Format**: SafeTensors (.safetensors)
- **Specialization**: In-scene image editing
- **Training Framework**: Diffusers + Accelerate
- **Memory Efficient**: Optimized for consumer GPUs

---

## 🤝 Support

If you have questions or suggestions, join our community:

- 🌐 [FlyMy.AI](https://flymy.ai)
- 💬 [Discord Community](https://discord.com/invite/t6hPBpSebw)
- 🐦 [Follow us on X](https://x.com/flymyai)
- 💼 [Connect on LinkedIn](https://linkedin.com/company/flymyai)
- 📧 [Support](mailto:[email protected])

**⭐ Don't forget to star the repository if you like it!**

---

## 📄 License

This project is licensed under the Apache 2.0 License - see the LICENSE file for details.