---
language:
- en
library_name: diffusers
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
base_model:
- black-forest-labs/FLUX.1-dev
- black-forest-labs/FLUX.1-schnell
base_model_relation: merge
pipeline_tag: text-to-image
---

# **FLUX.1-Merged (FLUX.1-dev + FLUX.1-schnell)**

> FLUX.1-Merged provides the merged parameters of black-forest-labs/FLUX.1-dev and black-forest-labs/FLUX.1-schnell, two text-to-image transformer models. The merge averages every parameter the two models share and copies the guidance parameters, which exist only in FLUX.1-dev, from FLUX.1-dev unchanged. The aim is a checkpoint that combines FLUX.1-dev's prompt fidelity and visual quality with FLUX.1-schnell's fast, few-step generation. The resulting checkpoint can be used for prompt-based image generation with the Diffusers library in creative and research applications. This card includes the code used to merge and save the model and to run inference with Diffusers.

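
The merge rule described above can be illustrated on toy data. This is a minimal sketch, not the code used to build the checkpoint: `merge_state_dicts` and the key names are hypothetical, and plain floats stand in for tensors (the function only relies on `+` and `/`, so it would behave the same way on tensors).

```python
def merge_state_dicts(dev, schnell, skip_marker="guidance"):
    """Average shared entries; copy entries whose key contains skip_marker from dev."""
    merged = {}
    for key, dev_value in dev.items():
        if skip_marker in key:
            # Guidance weights exist only in the dev checkpoint, so copy them unchanged.
            merged[key] = dev_value
        else:
            # Every other weight is the element-wise mean of the two checkpoints.
            merged[key] = (dev_value + schnell[key]) / 2
    return merged


# Toy values stand in for tensors; the key names are illustrative only.
dev = {"blocks.0.weight": 1.0, "guidance_embedder.weight": 4.0}
schnell = {"blocks.0.weight": 3.0}
merged = merge_state_dicts(dev, schnell)
print(merged)
```

The shared entry becomes the mean of the two inputs, while the guidance entry keeps its FLUX.1-dev value.
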

# **Sub-Memory-efficient merging code**

```python
import glob

import safetensors.torch
import torch
from accelerate import init_empty_weights
from diffusers import FluxTransformer2DModel
from diffusers.models.model_loading_utils import load_model_dict_into_meta
from huggingface_hub import snapshot_download, upload_folder

# Initialize the model with empty (meta) weights so no memory is allocated yet
with init_empty_weights():
    config = FluxTransformer2DModel.load_config("black-forest-labs/FLUX.1-dev", subfolder="transformer")
    model = FluxTransformer2DModel.from_config(config)

# Download only the transformer checkpoints
dev_ckpt = snapshot_download(repo_id="black-forest-labs/FLUX.1-dev", allow_patterns="transformer/*")
schnell_ckpt = snapshot_download(repo_id="black-forest-labs/FLUX.1-schnell", allow_patterns="transformer/*")

# Get the paths to the model shards
dev_shards = sorted(glob.glob(f"{dev_ckpt}/transformer/*.safetensors"))
schnell_shards = sorted(glob.glob(f"{schnell_ckpt}/transformer/*.safetensors"))

# Merge the state dictionaries shard by shard.
# This assumes both checkpoints are split into shards in the same way.
merged_state_dict = {}
guidance_state_dict = {}

for i in range(len(dev_shards)):
    state_dict_dev_temp = safetensors.torch.load_file(dev_shards[i])
    state_dict_schnell_temp = safetensors.torch.load_file(schnell_shards[i])

    keys = list(state_dict_dev_temp.keys())
    for k in keys:
        if "guidance" not in k:
            # Average the parameters shared by both models
            merged_state_dict[k] = (state_dict_dev_temp.pop(k) + state_dict_schnell_temp.pop(k)) / 2
        else:
            # Guidance parameters exist only in FLUX.1-dev; keep them as-is
            guidance_state_dict[k] = state_dict_dev_temp.pop(k)

    # Every key must have been consumed from both shards
    if len(state_dict_dev_temp) > 0:
        raise ValueError(f"There should not be any residue but got: {list(state_dict_dev_temp.keys())}.")
    if len(state_dict_schnell_temp) > 0:
        raise ValueError(f"There should not be any residue but got: {list(state_dict_schnell_temp.keys())}.")

# Update the merged state dictionary with the guidance state dictionary
merged_state_dict.update(guidance_state_dict)

# Load the merged state dictionary into the model.
# Note: this is an internal Diffusers helper, so its import path and
# signature may differ between Diffusers versions.
load_model_dict_into_meta(model, merged_state_dict)

# Save the merged model
model.to(torch.bfloat16).save_pretrained("transformer")

# Upload the merged model to the Hugging Face Hub
upload_folder(
    repo_id="prithivMLmods/Flux.1-Merged",  # Replace with your Hugging Face username and desired repo name
    folder_path="transformer",
    path_in_repo="transformer",
)
```
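
The merging loop pairs shard `i` of one checkpoint with shard `i` of the other, so it only works when both checkpoints place the same non-guidance keys in the same shards. The sketch below shows one way this assumption could be checked before loading any tensors. `check_shard_alignment` is a hypothetical helper, not part of Diffusers, and the key names are made up for illustration.

```python
def check_shard_alignment(dev_shard_keys, schnell_shard_keys, marker="guidance"):
    """Return a list of problems that would break a shard-by-shard merge."""
    problems = []
    if len(dev_shard_keys) != len(schnell_shard_keys):
        problems.append(
            f"shard count differs: {len(dev_shard_keys)} vs {len(schnell_shard_keys)}"
        )
    for i, (dev_keys, schnell_keys) in enumerate(zip(dev_shard_keys, schnell_shard_keys)):
        # Keys containing the marker are expected in the dev shard only.
        expected = {k for k in dev_keys if marker not in k}
        actual = set(schnell_keys)
        missing = sorted(expected - actual)
        extra = sorted(actual - expected)
        if missing:
            problems.append(f"shard {i}: missing from schnell: {missing}")
        if extra:
            problems.append(f"shard {i}: only in schnell: {extra}")
    return problems


# Hypothetical key lists, one inner list per shard.
dev_shard_keys = [["a.weight", "guidance_in.weight"], ["b.weight"]]
schnell_shard_keys = [["a.weight"], ["b.weight"]]
problems = check_shard_alignment(dev_shard_keys, schnell_shard_keys)
print(problems)
```

For real checkpoints, the key lists could be read without loading tensors by opening each shard with `safetensors.safe_open(path, framework="pt")` and calling `keys()` on the returned handle.
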

# **Inference**

```python
from diffusers import FluxPipeline
import torch

pipeline = FluxPipeline.from_pretrained(
    "prithivMLmods/Flux.1-Merged", torch_dtype=torch.bfloat16
).to("cuda")
image = pipeline(
    prompt="a tiny astronaut hatching from an egg on the moon",
    guidance_scale=3.5,
    num_inference_steps=4,
    height=880,
    width=1184,
    max_sequence_length=512,
    generator=torch.manual_seed(0),
).images[0]
image.save("merged_flux.png")
```
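
On GPUs that cannot hold the whole pipeline at once, a variant of the call above with model CPU offloading may help. This is a sketch under assumptions: `generate_with_offload` is a hypothetical wrapper, the sampling settings are copied from the example above, and whether it fits a given GPU has not been measured here. `enable_model_cpu_offload()` is the standard Diffusers pipeline method and requires `accelerate`.

```python
def generate_with_offload(prompt, repo_id="prithivMLmods/Flux.1-Merged", seed=0):
    """Generate one image while keeping idle pipeline components on the CPU."""
    # Imported lazily so that defining this function does not need a GPU stack.
    import torch
    from diffusers import FluxPipeline

    pipeline = FluxPipeline.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
    # Moves each component to the GPU only while it runs,
    # so the pipeline is not moved with .to("cuda") here.
    pipeline.enable_model_cpu_offload()
    return pipeline(
        prompt=prompt,
        guidance_scale=3.5,
        num_inference_steps=4,
        generator=torch.manual_seed(seed),
    ).images[0]
```

Calling `generate_with_offload("a tiny astronaut hatching from an egg on the moon")` would return a PIL image that can be saved with `.save(...)`, at the cost of slower generation than keeping everything on the GPU.
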

---

## For more information, visit the documentation.

> Flux is a suite of state-of-the-art text-to-image generation models based on diffusion transformers, developed by Black Forest Labs. The models are designed for high-quality generative image tasks, including text-to-image, inpainting, outpainting, and advanced structure- or depth-controlled workflows. Flux is available through the Hugging Face Diffusers library.

For detailed guides, examples, and API references, see:

- **[Main Flux Pipeline Documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux)**
- **[Flux Transformer Model Documentation](https://huggingface.co/docs/diffusers/main/en/api/models/flux_transformer)**