	---
	language:
	- en
	library_name: diffusers
	license: other
	license_name: flux-1-dev-non-commercial-license
	license_link: LICENSE.md
	base_model:
	- black-forest-labs/FLUX.1-dev
	- black-forest-labs/FLUX.1-schnell
	base_model_relation: merge
	pipeline_tag: text-to-image
	---

	# FLUX.1-Merged (FLUX.1-dev + FLUX.1-schnell)

	> FLUX.1-Merged provides the merged transformer parameters of black-forest-labs/FLUX.1-dev and black-forest-labs/FLUX.1-schnell, two text-to-image transformer models. The merge averages every non-guidance parameter of the two models and copies the guidance parameters unchanged from FLUX.1-dev, the only one of the two checkpoints that contains them. The aim is to combine FLUX.1-dev's prompt fidelity and visual quality with FLUX.1-schnell's ability to generate images in very few inference steps. The resulting checkpoint can be used for prompt-based image generation with the Diffusers library in creative and research applications. The sections below give the code used to merge and save the transformer, followed by an example of running inference with Diffusers.
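
The merge rule described above can be illustrated on toy data before looking at the full script. This is a minimal sketch in plain Python, with lists of floats standing in for tensors. The function name `merge_state_dicts` and the key names are hypothetical and exist only for illustration.

```python
def merge_state_dicts(dev, schnell):
    """Average shared entries; copy 'guidance' entries from dev only."""
    dev, schnell = dict(dev), dict(schnell)  # shallow copies, inputs stay intact
    merged = {}
    for key in list(dev):
        if "guidance" not in key:
            a, b = dev.pop(key), schnell.pop(key)
            merged[key] = [(x + y) / 2 for x, y in zip(a, b)]
        else:
            # Guidance entries have no counterpart in schnell, so keep dev's.
            merged[key] = dev.pop(key)
    if schnell:
        raise ValueError(f"Unmatched keys: {sorted(schnell)}")
    return merged


# Toy "checkpoints" with made-up key names and values.
toy_dev = {"block.weight": [1.0, 3.0], "guidance_in.weight": [0.5]}
toy_schnell = {"block.weight": [3.0, 5.0]}
toy_merged = merge_state_dicts(toy_dev, toy_schnell)
print(toy_merged)  # {'block.weight': [2.0, 4.0], 'guidance_in.weight': [0.5]}
```

Any key left over on the schnell side raises an error, which mirrors the residue checks in the merging script.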

	# Sub-Memory-efficient merging code

	```python
	from diffusers import FluxTransformer2DModel
	from huggingface_hub import snapshot_download
	from huggingface_hub import upload_folder
	from accelerate import init_empty_weights
	from diffusers.models.model_loading_utils import load_model_dict_into_meta
	import safetensors.torch
	import glob
	import torch


	# Initialize the model with empty weights
	with init_empty_weights():
	    config = FluxTransformer2DModel.load_config("black-forest-labs/FLUX.1-dev", subfolder="transformer")
	    model = FluxTransformer2DModel.from_config(config)

	# Download the model checkpoints
	dev_ckpt = snapshot_download(repo_id="black-forest-labs/FLUX.1-dev", allow_patterns="transformer/*")
	schnell_ckpt = snapshot_download(repo_id="black-forest-labs/FLUX.1-schnell", allow_patterns="transformer/*")

	# Get the paths to the model shards
	dev_shards = sorted(glob.glob(f"{dev_ckpt}/transformer/*.safetensors"))
	schnell_shards = sorted(glob.glob(f"{schnell_ckpt}/transformer/*.safetensors"))

	# Merge the state dictionaries
	merged_state_dict = {}
	guidance_state_dict = {}

	for i in range(len(dev_shards)):
	    state_dict_dev_temp = safetensors.torch.load_file(dev_shards[i])
	    state_dict_schnell_temp = safetensors.torch.load_file(schnell_shards[i])

	    keys = list(state_dict_dev_temp.keys())
	    for k in keys:
	        if "guidance" not in k:
	            merged_state_dict[k] = (state_dict_dev_temp.pop(k) + state_dict_schnell_temp.pop(k)) / 2
	        else:
	            guidance_state_dict[k] = state_dict_dev_temp.pop(k)

	    if len(state_dict_dev_temp) > 0:
	        raise ValueError(f"There should not be any residue but got: {list(state_dict_dev_temp.keys())}.")
	    if len(state_dict_schnell_temp) > 0:
	        raise ValueError(f"There should not be any residue but got: {list(state_dict_schnell_temp.keys())}.")

	# Update the merged state dictionary with the guidance state dictionary
	merged_state_dict.update(guidance_state_dict)

	# Load the merged state dictionary into the model
	load_model_dict_into_meta(model, merged_state_dict)

	# Save the merged model
	model.to(torch.bfloat16).save_pretrained("transformer")

	# Upload the merged model to the Hugging Face Hub
	upload_folder(
	    repo_id="prithivMLmods/Flux.1-Merged",  # Replace with your Hugging Face username and desired repo name
	    folder_path="transformer",
	    path_in_repo="transformer",
	)
	```
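The merging loop pairs `dev_shards[i]` with `schnell_shards[i]`, so it relies on both repositories splitting the transformer into the same shard files. A small pre-flight check can make that assumption explicit before any tensors are loaded. This is a sketch only: `check_shard_pairing` is a hypothetical helper, not part of Diffusers or `huggingface_hub`, and the paths in the usage lines are made up.

```python
import os


def check_shard_pairing(dev_shards, schnell_shards):
    """Raise if the two sorted shard lists do not line up file by file."""
    if len(dev_shards) != len(schnell_shards):
        raise ValueError(
            f"Shard count differs: {len(dev_shards)} vs {len(schnell_shards)}."
        )
    for dev_path, schnell_path in zip(dev_shards, schnell_shards):
        if os.path.basename(dev_path) != os.path.basename(schnell_path):
            raise ValueError(f"Shard names differ: {dev_path} vs {schnell_path}.")
    return True


# Made-up paths, only to show the call shape.
example_dev = ["dev/transformer/model-00001-of-00002.safetensors",
               "dev/transformer/model-00002-of-00002.safetensors"]
example_schnell = ["schnell/transformer/model-00001-of-00002.safetensors",
                   "schnell/transformer/model-00002-of-00002.safetensors"]
pairing_ok = check_shard_pairing(example_dev, example_schnell)
```

Matching file names do not prove that the shards hold the same keys. The `pop` calls and residue checks inside the merging loop still catch a real key mismatch.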
	# Inference

	```python
	from diffusers import FluxPipeline
	import torch

	pipeline = FluxPipeline.from_pretrained(
	    "prithivMLmods/Flux.1-Merged", torch_dtype=torch.bfloat16
	).to("cuda")
	image = pipeline(
	    prompt="a tiny astronaut hatching from an egg on the moon",
	    guidance_scale=3.5,
	    num_inference_steps=4,
	    height=880,
	    width=1184,
	    max_sequence_length=512,
	    generator=torch.manual_seed(0),
	).images[0]
	image.save("merged_flux.png")
	```
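
The `height` and `width` used above (880 and 1184) are both multiples of 16. When choosing other sizes, it may help to snap them to the same grid first. The sketch below assumes the pipeline expects dimensions divisible by 16, an assumption based on the 8x VAE downsampling combined with 2x2 latent patching, so verify it against the pipeline documentation for your Diffusers version. `round_to_multiple` is a hypothetical helper, not a Diffusers function.

```python
def round_to_multiple(value, multiple=16):
    """Round a positive pixel size down to a multiple, never below one multiple."""
    return max(multiple, (value // multiple) * multiple)


# Example: snap arbitrary requested sizes to the assumed 16-pixel grid.
height = round_to_multiple(900)
width = round_to_multiple(1200)
print(height, width)  # 896 1200
```

The resulting values can then be passed as `height` and `width` in the pipeline call.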

	---


	## For more information, visit the documentation.

	> Flux is a suite of state-of-the-art text-to-image generation models based on diffusion transformers, developed by Black Forest Labs. The models are designed for high-quality generative image tasks, including text-to-image, inpainting, outpainting, and advanced structure or depth-controlled workflows. Flux is available through the Hugging Face diffusers library.

	For detailed guides, examples, and API references, see:
	- [Main Flux Pipeline Documentation](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux)
	- [Flux Transformer Model Documentation](https://huggingface.co/docs/diffusers/main/en/api/models/flux_transformer)