---
license: apache-2.0
---
# Repos
https://github.com/mit-han-lab/deepcompressor
# Installation
https://github.com/mit-han-lab/deepcompressor/issues/56
https://github.com/nunchaku-tech/deepcompressor/issues/80
# Windows
https://learn.microsoft.com/en-us/windows/wsl/install
https://www.anaconda.com/docs/getting-started/miniconda/install
# Environment
- Python 3.10
- CUDA 12.8
- PyTorch 2.7
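
A quick way to compare an active environment against the versions listed above is sketched below. The helper name `describe_environment` is hypothetical (it is not part of deepcompressor), and the snippet only reports what it finds without enforcing anything.

```python
import sys


def describe_environment() -> dict:
    """Collect the Python, PyTorch, and CUDA versions visible to this interpreter."""
    info = {"python": f"{sys.version_info.major}.{sys.version_info.minor}"}
    try:
        import torch  # optional: reported as None when PyTorch is not installed

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
    except ImportError:
        info["torch"] = None
        info["cuda"] = None
    return info


print(describe_environment())
```
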
# Quantization
https://github.com/nunchaku-tech/deepcompressor/blob/main/examples/diffusion/README.md
Model Path: https://github.com/nunchaku-tech/deepcompressor/issues/70#issuecomment-2788155233
Save model: `--save-model true` or `--save-model /PATH/TO/CHECKPOINT/DIR`
Example: `python -m deepcompressor.app.diffusion.ptq examples/diffusion/configs/model/flux.1-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml`
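
The example command and the `--save-model` option above can also be assembled from Python, which helps when scripting several runs. This is a minimal sketch: `build_ptq_command` is a hypothetical helper, and it only builds the argument list without launching anything.

```python
import shlex
import sys


def build_ptq_command(model_config, quant_config, save_model=None):
    """Build the argv list for the deepcompressor diffusion PTQ entry point."""
    cmd = [
        sys.executable,
        "-m",
        "deepcompressor.app.diffusion.ptq",
        model_config,
        quant_config,
    ]
    if save_model is not None:
        # Either the literal string "true" or a checkpoint directory path.
        cmd += ["--save-model", save_model]
    return cmd


cmd = build_ptq_command(
    "examples/diffusion/configs/model/flux.1-dev.yaml",
    "examples/diffusion/configs/svdquant/nvfp4.yaml",
    save_model="true",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root.
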
Folder structure: a local model directory should follow the layout of the corresponding Hugging Face repository, for example:
- [black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
- [black-forest-labs/FLUX.1-Kontext-dev](https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/tree/main)
---
# Blockers
1) NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
Potential fix (attempted in `app/diffusion/pipeline/config.py`):
```python
# Excerpt: a static method of the pipeline config class; it relies on the
# imports already present in config.py (torch, diffusers pipelines, patch helpers).
@staticmethod
def _default_build(
    name: str,
    path: str,
    dtype: str | torch.dtype,
    device: str | torch.device,
    shift_activations: bool,
) -> DiffusionPipeline:
    # Fall back to the default Hugging Face repository for known model names.
    if not path:
        if name == "sdxl":
            path = "stabilityai/stable-diffusion-xl-base-1.0"
        elif name == "sdxl-turbo":
            path = "stabilityai/sdxl-turbo"
        elif name == "pixart-sigma":
            path = "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
        elif name == "flux.1-dev":
            path = "black-forest-labs/FLUX.1-dev"
        elif name == "flux.1-canny-dev":
            path = "black-forest-labs/FLUX.1-Canny-dev"
        elif name == "flux.1-depth-dev":
            path = "black-forest-labs/FLUX.1-Depth-dev"
        elif name == "flux.1-fill-dev":
            path = "black-forest-labs/FLUX.1-Fill-dev"
        elif name == "flux.1-schnell":
            path = "black-forest-labs/FLUX.1-schnell"
        else:
            raise ValueError(f"Path for {name} is not specified.")
    # Instantiate the pipeline
    if name in ["flux.1-canny-dev", "flux.1-depth-dev"]:
        pipeline = FluxControlPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name == "flux.1-fill-dev":
        pipeline = FluxFillPipeline.from_pretrained(path, torch_dtype=dtype)
    elif name.startswith("sana-"):
        if dtype == torch.bfloat16:
            pipeline = SanaPipeline.from_pretrained(
                path, variant="bf16", torch_dtype=dtype, use_safetensors=True
            )
            pipeline.vae.to(dtype)
            pipeline.text_encoder.to(dtype)
        else:
            pipeline = SanaPipeline.from_pretrained(path, torch_dtype=dtype)
    else:
        pipeline = AutoPipelineForText2Image.from_pretrained(path, torch_dtype=dtype)
    # Debug output
    print(">>> DEVICE:", device)
    print(">>> PIPELINE TYPE:", type(pipeline))
    # Try to move each component using .to_empty()
    for component_name in ["unet", "transformer", "vae", "text_encoder"]:
        module = getattr(pipeline, component_name, None)
        if isinstance(module, torch.nn.Module):
            try:
                print(f">>> Moving {component_name} to {device} using to_empty()")
                # to_empty() takes `device` as a keyword-only argument; the positional
                # call to_empty(device) raises the TypeError shown in the debug log.
                # Caution: to_empty() allocates uninitialized storage, so the weights
                # still have to be reloaded afterwards.
                module.to_empty(device=device)
            except Exception as e:
                print(f">>> WARNING: {component_name}.to_empty({device}) failed: {e}")
                try:
                    print(f">>> Falling back to {component_name}.to({device})")
                    module.to(device)
                except Exception as ee:
                    print(f">>> ERROR: {component_name}.to({device}) also failed: {ee}")
    # Identify main model (for patching)
    model = getattr(pipeline, "unet", None) or getattr(pipeline, "transformer", None)
    if model is not None:
        replace_fused_linear_with_concat_linear(model)
        replace_up_block_conv_with_concat_conv(model)
        if shift_activations:
            shift_input_activations(model)
    else:
        print(">>> WARNING: No model (unet/transformer) found for patching")
    return pipeline
```
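Debug log from a run that called `module.to_empty(device)` positionally. `to_empty()` accepts `device` only as a keyword argument, so every call failed and fell back to `.to(device)`: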
```
25-07-21 22:47:02 | I | === Start Evaluating ===
25-07-21 22:47:02 | I | * Building diffusion model pipeline
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 15.44it/s]
Loading pipeline components...: 57%|████████████████████████████████████████████████████████████████ | 4/7 [00:00<00:00, 9.47it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 7.79it/s]
>>> DEVICE: cuda
>>> PIPELINE TYPE: <class 'diffusers.pipelines.flux.pipeline_flux.FluxPipeline'>
>>> Moving transformer to cuda using to_empty()
>>> WARNING: transformer.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to transformer.to(cuda)
>>> ERROR: transformer.to(cuda) also failed: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
>>> Moving vae to cuda using to_empty()
>>> WARNING: vae.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to vae.to(cuda)
>>> Moving text_encoder to cuda using to_empty()
>>> WARNING: text_encoder.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to text_encoder.to(cuda)
25-07-21 22:47:05 | I | Replacing fused Linear with ConcatLinear.
25-07-21 22:47:05 | I | + Replacing fused Linear in single_transformer_blocks.0 with ConcatLinear.
25-07-21 22:47:05 | I | - in_features = 3072/15360
25-07-21 22:47:05 | I | - out_features = 3072
25-07-21 22:47:05 | I | + Replacing fused Linear in single_transformer_blocks.1 with ConcatLinear.
25-07-21 22:47:05 | I | - in_features = 3072/15360
25-07-21 22:47:05 | I | - out_features = 3072
25-07-21 22:47:05 | I | + Replacing fused Linear in single_transformer_blocks.2 with ConcatLinear.
25-07-21 22:47:05 | I | - in_features = 3072/15360
25-07-21 22:47:05 | I | - out_features = 3072
```
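
The two errors in the log above can be reproduced on a tiny module without a GPU. The sketch below uses a small `nn.Linear` as a stand-in for the pipeline components (an assumption for illustration only) and shows that `to_empty()` needs `device=` as a keyword and leaves the weights uninitialized until a state dict is loaded.

```python
import torch
from torch import nn

# A module on the "meta" device has shapes and dtypes but no actual data.
layer = nn.Linear(4, 2, device="meta")

# 1) .to() cannot copy data that does not exist.
try:
    layer.to("cpu")
    to_error = None
except NotImplementedError as exc:
    to_error = exc

# 2) to_empty() rejects a positional device argument.
try:
    layer.to_empty("cpu")
    positional_error = None
except TypeError as exc:
    positional_error = exc

# 3) The keyword form works, but the new storage is uninitialized,
#    so real weights have to be loaded afterwards.
layer.to_empty(device="cpu")
reference = nn.Linear(4, 2)  # stand-in for weights read from a checkpoint
layer.load_state_dict(reference.state_dict())

print(type(to_error).__name__, type(positional_error).__name__, layer.weight.device)
```

For a real pipeline this means `to_empty(device=...)` alone is not enough: the checkpoint weights would still need to be loaded into the moved modules.
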
2) KeyError: <class 'diffusers.models.transformers.transformer_flux.FluxAttention'>
---
# Dependencies
https://github.com/Dao-AILab/flash-attention
https://github.com/facebookresearch/xformers
https://github.com/openai/CLIP
https://github.com/THUDM/ImageReward
# Wheels
https://huggingface.co/datasets/siraxe/PrecompiledWheels_Torch-2.8-cu128-cp312
https://huggingface.co/lldacing/flash-attention-windows-wheel