---
license: apache-2.0
---

Repos

https://github.com/mit-han-lab/deepcompressor

Installation

https://github.com/mit-han-lab/deepcompressor/issues/56

https://github.com/nunchaku-tech/deepcompressor/issues/80
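
A typical from-source install is sketched below (an assumption, not a verified recipe; the issues above track platform-specific build problems):

Example: git clone https://github.com/mit-han-lab/deepcompressor && cd deepcompressor && pip install -e .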

Windows

https://learn.microsoft.com/en-us/windows/wsl/install

https://www.anaconda.com/docs/getting-started/miniconda/install
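
In short, per the linked guides: run wsl --install from an elevated PowerShell, reboot, then install Miniconda inside the WSL distribution and run everything below from that shell.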

Environment

Python 3.10

CUDA 12.8

PyTorch 2.7

diffusers (from source): https://github.com/huggingface/diffusers

Transformers 4.51
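
A matching environment can be set up roughly like this (a sketch; the wheel index URL and exact patch versions are assumptions):

    conda create -n deepcompressor python=3.10 -y
    conda activate deepcompressor
    pip install torch==2.7.0 --index-url https://download.pytorch.org/whl/cu128
    pip install "transformers==4.51.*" git+https://github.com/huggingface/diffusers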

Quantization

https://github.com/nunchaku-tech/deepcompressor/blob/main/examples/diffusion/README.md

Model Path: https://github.com/nunchaku-tech/deepcompressor/issues/70#issuecomment-2788155233

Save the quantized model by passing --save-model true, or --save-model /PATH/TO/CHECKPOINT/DIR to choose the output directory.

Example: python -m deepcompressor.app.diffusion.ptq examples/diffusion/configs/model/flux.1-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml
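
Combining the two, a full run that also saves the quantized checkpoint (the output path is a placeholder):

Example: python -m deepcompressor.app.diffusion.ptq examples/diffusion/configs/model/flux.1-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml --save-model /PATH/TO/CHECKPOINT/DIR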

Folder Structure


Blockers

  1. NotImplementedError: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
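
The error means the pipeline components' parameters are still on PyTorch's meta device (shape-only placeholders with no storage), so .to() has nothing to copy. A standalone sketch of the failure mode and the keyword-only to_empty() escape hatch (not project code):

    import torch

    # Parameters created under the meta device carry shapes but no data.
    with torch.device("meta"):
        layer = torch.nn.Linear(4, 4)

    try:
        layer.to("cpu")  # NotImplementedError: Cannot copy out of meta tensor
    except NotImplementedError as e:
        print(e)

    # to_empty() allocates fresh, *uninitialized* storage on the target device
    # instead of copying; note that `device` is a keyword-only argument.
    layer = layer.to_empty(device="cpu")
    print(layer.weight.device)  # cpu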

Potential fix, patching deepcompressor/app/diffusion/pipeline/config.py:

    @staticmethod
    def _default_build(
        name: str, path: str, dtype: str | torch.dtype, device: str | torch.device, shift_activations: bool
    ) -> DiffusionPipeline:
        if not path:
            if name == "sdxl":
                path = "stabilityai/stable-diffusion-xl-base-1.0"
            elif name == "sdxl-turbo":
                path = "stabilityai/sdxl-turbo"
            elif name == "pixart-sigma":
                path = "PixArt-alpha/PixArt-Sigma-XL-2-1024-MS"
            elif name == "flux.1-kontext-dev":
                path = "black-forest-labs/FLUX.1-Kontext-dev"
            elif name == "flux.1-dev":
                path = "black-forest-labs/FLUX.1-dev"
            elif name == "flux.1-canny-dev":
                path = "black-forest-labs/FLUX.1-Canny-dev"
            elif name == "flux.1-depth-dev":
                path = "black-forest-labs/FLUX.1-Depth-dev"
            elif name == "flux.1-fill-dev":
                path = "black-forest-labs/FLUX.1-Fill-dev"
            elif name == "flux.1-schnell":
                path = "black-forest-labs/FLUX.1-schnell"
            else:
                raise ValueError(f"Path for {name} is not specified.")
        if name in ["flux.1-kontext-dev"]:
            pipeline = FluxKontextPipeline.from_pretrained(path, torch_dtype=dtype)
        elif name in ["flux.1-canny-dev", "flux.1-depth-dev"]:
            pipeline = FluxControlPipeline.from_pretrained(path, torch_dtype=dtype)
        elif name == "flux.1-fill-dev":
            pipeline = FluxFillPipeline.from_pretrained(path, torch_dtype=dtype)
        elif name.startswith("sana-"):
            if dtype == torch.bfloat16:
                pipeline = SanaPipeline.from_pretrained(path, variant="bf16", torch_dtype=dtype, use_safetensors=True)
                pipeline.vae.to(dtype)
                pipeline.text_encoder.to(dtype)
            else:
                pipeline = SanaPipeline.from_pretrained(path, torch_dtype=dtype)
        else:
            pipeline = AutoPipelineForText2Image.from_pretrained(path, torch_dtype=dtype)

        # Debug output
        print(">>> DEVICE:", device)
        print(">>> PIPELINE TYPE:", type(pipeline))
    
        # Try to move each component using .to_empty(); use a separate loop
        # variable so the `name` argument is not shadowed
        for component in ["unet", "transformer", "vae", "text_encoder"]:
            module = getattr(pipeline, component, None)
            if isinstance(module, torch.nn.Module):
                try:
                    print(f">>> Moving {component} to {device} using to_empty()")
                    # `device` is keyword-only; passing it positionally raises
                    # "to_empty() takes 1 positional argument but 2 were given"
                    # (exactly the failure recorded in the debug log below)
                    module.to_empty(device=device)
                except Exception as e:
                    print(f">>> WARNING: {component}.to_empty(device={device}) failed: {e}")
                    try:
                        print(f">>> Falling back to {component}.to({device})")
                        module.to(device)
                    except Exception as ee:
                        print(f">>> ERROR: {component}.to({device}) also failed: {ee}")
    
        # Identify main model (for patching)
        model = getattr(pipeline, "unet", None) or getattr(pipeline, "transformer", None)
        if model is not None:
            replace_fused_linear_with_concat_linear(model)
            replace_up_block_conv_with_concat_conv(model)
            if shift_activations:
                shift_input_activations(model)
        else:
            print(">>> WARNING: No model (unet/transformer) found for patching")
    
        return pipeline
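
Even where to_empty(device=...) succeeds, it allocates uninitialized storage rather than copying weights, so any component that was still on the meta device comes back with garbage parameters; why the components end up on the meta device during loading remains the open question.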

Debug Log (from a run before the keyword-argument fix above)

25-07-22 20:11:56 | I | === Start Evaluating ===
25-07-22 20:11:56 | I | * Building diffusion model pipeline
Loading pipeline components...:   0%|                                                             | 0/7 [00:00<?, ?it/s]
You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|██████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 18.92it/s]
Loading pipeline components...: 100%|█████████████████████████████████████████████████████| 7/7 [00:00<00:00,  9.50it/s]
>>> DEVICE: cuda
>>> PIPELINE TYPE: <class 'diffusers.pipelines.flux.pipeline_flux_kontext.FluxKontextPipeline'>
>>> Moving transformer to cuda using to_empty()
>>> WARNING: transformer.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to transformer.to(cuda)
>>> ERROR: transformer.to(cuda) also failed: Cannot copy out of meta tensor; no data! Please use torch.nn.Module.to_empty() instead of torch.nn.Module.to() when moving module from meta to a different device.
>>> Moving vae to cuda using to_empty()
>>> WARNING: vae.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to vae.to(cuda)
>>> Moving text_encoder to cuda using to_empty()
>>> WARNING: text_encoder.to_empty(cuda) failed: Module.to_empty() takes 1 positional argument but 2 were given
>>> Falling back to text_encoder.to(cuda)
25-07-22 20:11:59 | I |   Replacing fused Linear with ConcatLinear.
25-07-22 20:11:59 | I |     + Replacing fused Linear in single_transformer_blocks.0 with ConcatLinear.
25-07-22 20:11:59 | I |       - in_features = 3072/15360
25-07-22 20:11:59 | I |       - out_features = 3072
25-07-22 20:11:59 | I |     + Replacing fused Linear in single_transformer_blocks.1 with ConcatLinear.
25-07-22 20:11:59 | I |       - in_features = 3072/15360
25-07-22 20:11:59 | I |       - out_features = 3072
25-07-22 20:11:59 | I |     + Replacing fused Linear in single_transformer_blocks.2 with ConcatLinear.
25-07-22 20:11:59 | I |       - in_features = 3072/15360
25-07-22 20:11:59 | I |       - out_features = 3072
  2. KeyError: <class 'diffusers.models.transformers.transformer_flux.FluxAttention'> (see the diagnostic sketch after the reference links below)

https://github.com/nunchaku-tech/deepcompressor/blob/main/deepcompressor/nn/struct/attn.py

https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-kontext-dev.py

https://github.com/nunchaku-tech/nunchaku/commit/b99fb8be615bc98c6915bbe06a1e0092cbc074a5

https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/flux/pipeline_flux_kontext.py
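
The KeyError indicates the installed diffusers build defines the newer FluxAttention class, which the attention struct map in attn.py (first link above) does not cover. A quick check of what your diffusers version provides, assuming only the class path from the traceback:

    import diffusers
    from diffusers.models.transformers import transformer_flux

    # Recent diffusers releases moved Flux attention into a dedicated
    # FluxAttention class; older releases used the shared Attention module.
    print(diffusers.__version__)
    print(hasattr(transformer_flux, "FluxAttention"))

If this prints True, either pin diffusers to an older release or extend the mapping in attn.py to register FluxAttention.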


Dependencies

https://github.com/Dao-AILab/flash-attention

https://github.com/facebookresearch/xformers

https://github.com/openai/CLIP

https://github.com/THUDM/ImageReward
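
A quick way to confirm the four dependencies are importable (assuming their usual import names):

    import flash_attn, xformers, clip, ImageReward

    print(flash_attn.__version__, xformers.__version__)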

Wheels

https://huggingface.co/datasets/siraxe/PrecompiledWheels_Torch-2.8-cu128-cp312

https://huggingface.co/lldacing/flash-attention-windows-wheel
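
Note that the first wheel set targets Torch 2.8 / CUDA 12.8 / Python 3.12, which does not match the Python 3.10 / Torch 2.7 environment above; flash-attention wheels are specific to the interpreter and torch build they were compiled against, so pick wheels that match your environment.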