---
license: apache-2.0
---
# Repos
https://github.com/mit-han-lab/deepcompressor
# Installation
https://github.com/mit-han-lab/deepcompressor/issues/56
https://github.com/nunchaku-tech/deepcompressor/issues/80
# Windows
https://learn.microsoft.com/en-us/windows/wsl/install
https://www.anaconda.com/docs/getting-started/miniconda/install
# Environment
- python 3.12
- cuda 12.8
- torch 2.7
- diffusers: https://github.com/huggingface/diffusers
- transformers 4.51
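The versions above can be assembled into a conda environment roughly as follows (a sketch, not verified commands; the `cu128` wheel index and installing diffusers from the repo are assumptions):

```shell
# Sketch of an environment matching the versions listed above;
# adjust the CUDA wheel index to your driver/toolkit.
conda create -n deepcompressor python=3.12 -y
conda activate deepcompressor

# PyTorch 2.7 built against CUDA 12.8 (cu128 wheel index assumed)
pip install torch==2.7.0 --index-url https://download.pytorch.org/whl/cu128

# transformers 4.51, and diffusers from source as linked above
pip install "transformers==4.51.*"
pip install git+https://github.com/huggingface/diffusers
```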
# Calibration
https://github.com/nunchaku-tech/deepcompressor/blob/main/examples/diffusion/README.md#step-2-calibration-dataset-preparation
# Quantization
https://github.com/nunchaku-tech/deepcompressor/blob/main/examples/diffusion/README.md#step-3-model-quantization
Model Path: https://github.com/nunchaku-tech/deepcompressor/issues/70#issuecomment-2788155233
Save model: `--save-model true` or `--save-model /PATH/TO/CHECKPOINT/DIR`
Example: `python -m deepcompressor.app.diffusion.ptq examples/diffusion/configs/model/flux.1-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml`
Folder Structure
- refer to [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev/tree/main)
- refer to [black-forest-labs/FLUX.1-Kontext-dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/tree/main)
---
# Blockers
1) `NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.`
Potential fix: patch `_default_build` in `deepcompressor/app/diffusion/pipeline/config.py`:
```python
@staticmethod
def _default_build(
    name: str, path: str, dtype: str | torch.dtype, device: str | torch.device, shift_activations: bool
) -> DiffusionPipeline:
    if not path:
        if name == "sdxl":
            path = "stabilityai/stable-diffusion-xl-base-1.0"
        elif name == "sdxl-turbo":
            path = "stabilityai/sdxl-turbo"
        elif name == "pixart-sigma":
            path = "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
        elif name == "flux.1-kontext-dev":
            path = "black-forest-labs/FLUX.1-Kontext-dev"
        elif name == "flux.1-dev":
            path = "black-forest-labs/FLUX.1-dev"
        elif name == "flux.1-canny-dev":
            path = "black-forest-labs/FLUX.1-Canny-dev"
        elif name == "flux.1-depth-dev":
            path = "black-forest-labs/FLUX.1-Depth-dev"
        elif name == "flux.1-fill-dev":
            path = "black-forest-labs/FLUX.1-Fill-dev"
        elif name == "flux.1-schnell":
            path = "black-forest-labs/FLUX.1-schnell"
        else:
            raise ValueError(f"Path for {name} is not specified.")
    if name in ["flux.1-kontext-dev"]:
        pipeline = FluxKontextPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name in ["flux.1-canny-dev", "flux.1-depth-dev"]:
        pipeline = FluxControlPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name == "flux.1-fill-dev":
        pipeline = FluxFillPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name.startswith("sana-"):
        if dtype == torch.bfloat16:
            pipeline = SanaPipeline.from_pretrained(path, variant="bf16", torch_dtype=dtype, use_safetensors=True)
            pipeline.vae.to(dtype)
            pipeline.text_encoder.to(dtype)
        else:
            pipeline = SanaPipeline.from_pretrained(path, torch_dtype=dtype)
    else:
        pipeline = AutoPipelineForText2Image.from_pretrained(path, torch_dtype=dtype)
    # Debug output
    print(">>> DEVICE:", device)
    print(">>> PIPELINE TYPE:", type(pipeline))
    # Move each component to the target device. Module.to_empty() takes
    # `device` as a keyword-only argument; it also allocates uninitialized
    # storage, so the real weights must still be loaded back afterwards.
    # (The loop variable must not shadow the `name` parameter.)
    for component in ["unet", "transformer", "vae", "text_encoder"]:
        module = getattr(pipeline, component, None)
        if isinstance(module, torch.nn.Module):
            try:
                print(f">>> Moving {component} to {device} using to_empty()")
                module.to_empty(device=device)
            except Exception as e:
                print(f">>> WARNING: {component}.to_empty(device={device}) failed: {e}")
                try:
                    print(f">>> Falling back to {component}.to({device})")
                    module.to(device)
                except Exception as ee:
                    print(f">>> ERROR: {component}.to({device}) also failed: {ee}")
    # Identify the main model (for patching)
    model = getattr(pipeline, "unet", None) or getattr(pipeline, "transformer", None)
    if model is not None:
        replace_fused_linear_with_concat_linear(model)
        replace_up_block_conv_with_concat_conv(model)
        if shift_activations:
            shift_input_activations(model)
    else:
        print(">>> WARNING: No model (unet/transformer) found for patching")
    return pipeline
```
Debug Log
```
25-07-22 20:11:56 | I | === Start Evaluating ===
25-07-22 20:11:56 | I | * Building diffusion model pipeline
Loading pipeline components...: 0%| | 0/7 [00:00<?, ?it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 18.92it/s]
Loading pipeline components...: 100%|█████████████████████████████████████████████████████| 7/7 [00:00<00:00, 9.50it/s]
>>> DEVICE: cuda
>>> PIPELINE TYPE: <class 'diffusers.pipelines.flux.pipeline_flux_kontext.FluxKontextPipeline'>
>>> Moving transformer to cuda using to_empty()
>>> WARNING: transformer.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to transformer.to(cuda)
>>> ERROR: transformer.to(cuda) also failed: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
>>> Moving vae to cuda using to_empty()
>>> WARNING: vae.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to vae.to(cuda)
>>> Moving text_encoder to cuda using to_empty()
>>> WARNING: text_encoder.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to text_encoder.to(cuda)
25-07-22 20:11:59 | I | Replacing fused Linear with ConcatLinear.
25-07-22 20:11:59 | I | + Replacing fused Linear in single_transformer_blocks.0 with ConcatLinear.
25-07-22 20:11:59 | I | - in_features = 3072/15360
25-07-22 20:11:59 | I | - out_features = 3072
25-07-22 20:11:59 | I | + Replacing fused Linear in single_transformer_blocks.1 with ConcatLinear.
25-07-22 20:11:59 | I | - in_features = 3072/15360
25-07-22 20:11:59 | I | - out_features = 3072
25-07-22 20:11:59 | I | + Replacing fused Linear in single_transformer_blocks.2 with ConcatLinear.
25-07-22 20:11:59 | I | - in_features = 3072/15360
25-07-22 20:11:59 | I | - out_features = 3072
```
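The two failures in the log have distinct causes: `Module.to_empty()` takes `device` as a keyword-only argument (so the positional call raises `TypeError`), and `.to()` cannot copy a meta tensor because it has no storage. A minimal standalone reproduction:

```python
import torch

# A module created on the "meta" device has shapes but no data.
lin = torch.nn.Linear(4, 4, device="meta")

try:
    lin.to("cpu")  # cannot copy: meta tensors have no storage
except NotImplementedError as e:
    print("to() failed:", e)

try:
    lin.to_empty("cpu")  # wrong: device is keyword-only
except TypeError as e:
    print("positional to_empty() failed:", e)

# Correct call: allocates uninitialized storage on the target device;
# real weights must still be loaded afterwards (e.g. load_state_dict).
lin = lin.to_empty(device="cpu")
print(lin.weight.is_meta)  # False
```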
2) `KeyError: <class 'diffusers.models.transformers.transformer_flux.FluxAttention'>`
https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_flux.py#L266
https://github.com/nunchaku-tech/deepcompressor/blob/main/deepcompressor/nn/struct/attn.py
https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-kontext-dev.py
https://github.com/nunchaku-tech/nunchaku/commit/b99fb8be615bc98c6915bbe06a1e0092cbc074a5
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/flux/pipeline_flux_kontext.py
https://github.com/nunchaku-tech/deepcompressor/issues/91
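This `KeyError` is the usual symptom of a quantizer keeping a lookup table keyed by exact attention class while a newer diffusers release introduces a class (here `FluxAttention`) that the table predates; workarounds are registering the new class in `deepcompressor/nn/struct/attn.py` or pinning diffusers to an older release. A minimal sketch of the failure mode and a base-class fallback, with all names hypothetical (this is not deepcompressor's actual code):

```python
class Attention: ...                  # base class the registry knows
class FluxAttention(Attention): ...   # newer subclass the registry predates

# Hypothetical registry keyed by exact module class.
STRUCT_REGISTRY = {Attention: "attention-struct"}

def lookup(module_cls):
    try:
        return STRUCT_REGISTRY[module_cls]   # exact lookup -> KeyError
    except KeyError:
        # Fallback: walk the MRO so subclasses of known classes resolve.
        for base in module_cls.__mro__[1:]:
            if base in STRUCT_REGISTRY:
                return STRUCT_REGISTRY[base]
        raise

print(lookup(FluxAttention))  # "attention-struct" via the Attention base
```

Note that this fallback only helps if the new class actually subclasses a registered one; otherwise the new class must be registered explicitly.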
---
# Dependencies
https://github.com/Dao-AILab/flash-attention
https://github.com/facebookresearch/xformers
https://github.com/openai/CLIP
https://github.com/THUDM/ImageReward
# Wheels
https://huggingface.co/datasets/siraxe/PrecompiledWheels_Torch-2.8-cu128-cp312
https://huggingface.co/lldacing/flash-attention-windows-wheel