license: apache-2.0
Repos
https://github.com/mit-han-lab/deepcompressor
Installation
https://github.com/mit-han-lab/deepcompressor/issues/56
https://github.com/nunchaku-tech/deepcompressor/issues/80
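A hedged sketch of the install route assumed in these notes (clone plus editable pip install into the environment described below); the repo README and the two issues above are the authority if this fails:

git clone https://github.com/mit-han-lab/deepcompressor
cd deepcompressor
pip install -e .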
Windows
https://learn.microsoft.com/en-us/windows/wsl/install
https://www.anaconda.com/docs/getting-started/miniconda/install
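Per the two guides above, WSL is enabled from an elevated PowerShell prompt and Miniconda is then installed inside the WSL distribution (commands and installer URL as documented by Microsoft and Anaconda):

wsl --install
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh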
Environment
python 3.10
cuda 12.8
torch 2.7
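A minimal environment sketch matching the versions above, assuming Miniconda inside WSL and PyTorch's cu128 wheel index (the exact torch build is an assumption; adjust to whatever wheel you actually install):

conda create -n deepcompressor python=3.10
conda activate deepcompressor
pip install torch==2.7.0 --index-url https://download.pytorch.org/whl/cu128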
Quantization
https://github.com/nunchaku-tech/deepcompressor/blob/main/examples/diffusion/README.md
Model Path: https://github.com/nunchaku-tech/deepcompressor/issues/70#issuecomment-2788155233
Save model: add --save-model true or --save-model /PATH/TO/CHECKPOINT/DIR
Example: python -m deepcompressor.app.diffusion.ptq examples/diffusion/configs/model/flux.1-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml
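Putting the flags together with the example command (the checkpoint directory is a placeholder, as above):

python -m deepcompressor.app.diffusion.ptq examples/diffusion/configs/model/flux.1-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml --save-model /PATH/TO/CHECKPOINT/DIR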
Folder Structure
Blockers
- NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
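The error means the FLUX transformer was still on PyTorch's meta device (shapes only, no storage) when the code tried to move it with .to(). A minimal standalone repro, using nothing but torch, shows both the failure and the keyword-only to_empty() call that the debug log below trips over:

import torch

# A module created on the meta device has parameter shapes but no storage,
# so .to() cannot copy anything off it.
layer = torch.nn.Linear(4, 4, device="meta")
try:
    layer.to("cpu")
except NotImplementedError as e:
    print("to() fails:", e)

# to_empty() takes `device` as a keyword-only argument (passing it positionally
# raises the "takes 1 positional argument" error seen in the debug log) and
# allocates uninitialized storage, so real weights still have to be loaded
# into the module afterwards.
layer = layer.to_empty(device="cpu")
print(layer.weight.shape, layer.weight.device)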
Potential fix: patch _default_build in deepcompressor/app/diffusion/pipeline/config.py so that components still on the meta device are materialized with to_empty(device=...) and everything else is moved with the usual .to():
@staticmethod
def _default_build(
    name: str,
    path: str,
    dtype: str | torch.dtype,
    device: str | torch.device,
    shift_activations: bool,
) -> DiffusionPipeline:
    # Resolve the default Hugging Face model path for known model names.
    if not path:
        if name == "sdxl":
            path = "stabilityai/stable-diffusion-xl-base-1.0"
        elif name == "sdxl-turbo":
            path = "stabilityai/sdxl-turbo"
        elif name == "pixart-sigma":
            path = "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
        elif name == "flux.1-dev":
            path = "black-forest-labs/FLUX.1-dev"
        elif name == "flux.1-canny-dev":
            path = "black-forest-labs/FLUX.1-Canny-dev"
        elif name == "flux.1-depth-dev":
            path = "black-forest-labs/FLUX.1-Depth-dev"
        elif name == "flux.1-fill-dev":
            path = "black-forest-labs/FLUX.1-Fill-dev"
        elif name == "flux.1-schnell":
            path = "black-forest-labs/FLUX.1-schnell"
        else:
            raise ValueError(f"Path for {name} is not specified.")
    # Instantiate the pipeline.
    if name in ["flux.1-canny-dev", "flux.1-depth-dev"]:
        pipeline = FluxControlPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name == "flux.1-fill-dev":
        pipeline = FluxFillPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name.startswith("sana-"):
        if dtype == torch.bfloat16:
            pipeline = SanaPipeline.from_pretrained(
                path, variant="bf16", torch_dtype=dtype, use_safetensors=True
            )
            pipeline.vae.to(dtype)
            pipeline.text_encoder.to(dtype)
        else:
            pipeline = SanaPipeline.from_pretrained(path, torch_dtype=dtype)
    else:
        pipeline = AutoPipelineForText2Image.from_pretrained(path, torch_dtype=dtype)
    # Debug output.
    print(">>> DEVICE:", device)
    print(">>> PIPELINE TYPE:", type(pipeline))
    # Move each component to the target device. Module.to_empty() takes `device`
    # as a keyword-only argument and allocates *uninitialized* storage, so it is
    # only used for modules that are still on the meta device (their weights must
    # be loaded afterwards); everything else is moved with the usual .to().
    for component in ["unet", "transformer", "vae", "text_encoder"]:
        module = getattr(pipeline, component, None)
        if not isinstance(module, torch.nn.Module):
            continue
        on_meta = any(p.is_meta for p in module.parameters())
        try:
            if on_meta:
                print(f">>> Moving {component} to {device} using to_empty()")
                module.to_empty(device=device)
            else:
                print(f">>> Moving {component} to {device} using to()")
                module.to(device)
        except Exception as e:
            print(f">>> ERROR: moving {component} to {device} failed: {e}")
    # Identify the main model (for patching).
    model = getattr(pipeline, "unet", None) or getattr(pipeline, "transformer", None)
    if model is not None:
        replace_fused_linear_with_concat_linear(model)
        replace_up_block_conv_with_concat_conv(model)
        if shift_activations:
            shift_input_activations(model)
    else:
        print(">>> WARNING: No model (unet/transformer) found for patching")
    return pipeline
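The key changes are passing device as a keyword (to_empty() is keyword-only) and only using to_empty() for modules that are actually on the meta device, since it allocates storage without loading any weights. If the transformer really was materialized on meta, its weights still have to be loaded before calibration, so this is best treated as a diagnostic workaround rather than a complete fix. The debug log below was captured with an earlier variant that passed the device positionally, which is why every to_empty() call there fails and falls back to .to().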
Debug Log
25-07-21 22:47:02 | I | === Start Evaluating ===
25-07-21 22:47:02 | I | * Building diffusion model pipeline
Loading checkpoint shards: 100% | 2/2 [00:00<00:00, 15.44it/s]
Loading pipeline components...: 57% | 4/7 [00:00<00:00, 9.47it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100% | 7/7 [00:00<00:00, 7.79it/s]
>>> DEVICE: cuda
>>> PIPELINE TYPE: <class 'diffusers.pipelines.flux.pipeline_flux.FluxPipeline'>
>>> Moving transformer to cuda using to_empty()
>>> WARNING: transformer.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to transformer.to(cuda)
>>> ERROR: transformer.to(cuda) also failed: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
>>> Moving vae to cuda using to_empty()
>>> WARNING: vae.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to vae.to(cuda)
>>> Moving text_encoder to cuda using to_empty()
>>> WARNING: text_encoder.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to text_encoder.to(cuda)
25-07-21 22:47:05 | I | Replacing fused Linear with ConcatLinear.
25-07-21 22:47:05 | I | + Replacing fused Linear in single_transformer_blocks.0 with ConcatLinear.
25-07-21 22:47:05 | I | - in_features = 3072/15360
25-07-21 22:47:05 | I | - out_features = 3072
25-07-21 22:47:05 | I | + Replacing fused Linear in single_transformer_blocks.1 with ConcatLinear.
25-07-21 22:47:05 | I | - in_features = 3072/15360
25-07-21 22:47:05 | I | - out_features = 3072
25-07-21 22:47:05 | I | + Replacing fused Linear in single_transformer_blocks.2 with ConcatLinear.
25-07-21 22:47:05 | I | - in_features = 3072/15360
25-07-21 22:47:05 | I | - out_features = 3072
- KeyError: <class 'diffusers.models.transformers.transformer_flux.FluxAttention'>
Most likely a diffusers version mismatch: recent diffusers releases moved FLUX attention into a dedicated FluxAttention class, which deepcompressor's class-keyed attention lookup does not yet recognize. Pinning diffusers to the version the examples were written against may avoid this.
Dependencies
https://github.com/Dao-AILab/flash-attention
https://github.com/facebookresearch/xformers
https://github.com/openai/CLIP
https://github.com/THUDM/ImageReward
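A hedged sketch of installing these four dependencies with their documented pip commands (package names as published on PyPI; flash-attn and xformers must be built against the same torch/CUDA versions as above):

pip install flash-attn --no-build-isolation
pip install xformers
pip install git+https://github.com/openai/CLIP.git
pip install image-reward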
Wheels
https://huggingface.co/datasets/siraxe/PrecompiledWheels_Torch-2.8-cu128-cp312
https://huggingface.co/lldacing/flash-attention-windows-wheel
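Note: a prebuilt wheel has to match the local Python, torch, and CUDA versions exactly. The first collection targets torch 2.8 / cu128 / cp312, which does not match the python 3.10 / torch 2.7 environment above, so pick a wheel accordingly or build from source.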