Update README.md
README.md (CHANGED)
@@ -62,7 +62,7 @@ Model Path: https://github.com/nunchaku-tech/deepcompressor/issues/70#issuecomme

Save model: `--save-model true` or `--save-model /PATH/TO/CHECKPOINT/DIR`

Example: `python -m deepcompressor.app.diffusion.ptq svdq/flux.1-kontext-dev.yaml examples/diffusion/configs/svdquant/nvfp4.yaml --pipeline-path svdq/flux.1-kontext-dev/ --save-model ~/svdq/`
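As a quick sanity check after the run, you can list what actually landed in the checkpoint directory. This is a generic sketch rather than part of deepcompressor; the `~/svdq` path simply mirrors the example command above, and the exact file names depend on your config.

```python
import os

# Directory passed via --save-model in the example command above.
ckpt_dir = os.path.expanduser("~/svdq")

# Walk the checkpoint directory and print every saved file with its size.
for root, _dirs, files in os.walk(ckpt_dir):
    for name in sorted(files):
        path = os.path.join(root, name)
        print(f"{path}  ({os.path.getsize(path) / 1e6:.1f} MB)")
```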

Model Files Structure

@@ -411,20 +411,129 @@ def collect(config: DiffusionPtqRunConfig, dataset: datasets.Dataset):

caches.clear()
```

4) RuntimeError: Tensor.item() cannot be called on meta tensors

Potential Fix: `deepcompressor/quantizer/impl/scale.py`

```python
def quantize(
    self,
    *,
    # scale-based quantization related arguments
    scale: torch.Tensor | None = None,
    zero: torch.Tensor | None = None,
    # range-based quantization related arguments
    tensor: torch.Tensor | None = None,
    dynamic_range: DynamicRange | None = None,
) -> tuple[QuantScale, torch.Tensor]:
    """Get the quantization scale and zero point of the tensor to be quantized.

    Args:
        scale (`torch.Tensor` or `None`, *optional*, defaults to `None`):
            The scale tensor.
        zero (`torch.Tensor` or `None`, *optional*, defaults to `None`):
            The zero point tensor.
        tensor (`torch.Tensor` or `None`, *optional*, defaults to `None`):
            The tensor to be quantized. This is only used for range-based quantization.
        dynamic_range (`DynamicRange` or `None`, *optional*, defaults to `None`):
            The dynamic range of the tensor to be quantized.

    Returns:
        `tuple[QuantScale, torch.Tensor]`:
            The scale and the zero point.
    """
    # region step 1: get the dynamic span for range-based scale or the scale tensor
    if scale is None:
        range_based = True
        assert isinstance(tensor, torch.Tensor), "View tensor must be a tensor."
        dynamic_range = dynamic_range or DynamicRange()
        dynamic_range = dynamic_range.measure(
            tensor.view(self.tensor_view_shape),
            zero_domain=self.tensor_zero_domain,
            is_float_point=self.tensor_quant_dtype.is_float_point,
        )
        dynamic_range = dynamic_range.intersect(self.tensor_range_bound)
        dynamic_span = (dynamic_range.max - dynamic_range.min) if self.has_zero_point else dynamic_range.max
    else:
        range_based = False
        scale = scale.view(self.scale_view_shapes[-1])
        assert isinstance(scale, torch.Tensor), "Scale must be a tensor."
    # endregion
    # region step 2: get the scale
    if self.linear_scale_quant_dtypes:
        if range_based:
            linear_scale = dynamic_span / self.linear_tensor_quant_span
        elif self.exponent_scale_quant_dtypes:
            linear_scale = scale.mul(self.exponent_tensor_quant_span).div(self.linear_tensor_quant_span)
        else:
            linear_scale = scale
        lin_s = quantize_scale(
            linear_scale,
            quant_dtypes=self.linear_scale_quant_dtypes,
            quant_spans=self.linear_scale_quant_spans,
            view_shapes=self.linear_scale_view_shapes,
        )
        assert lin_s.data is not None, "Linear scale tensor is None."
        # Skip the NaN/Inf checks for meta tensors: evaluating the assert needs a
        # Python bool, i.e. Tensor.item(), which cannot be called on meta tensors.
        if not lin_s.data.is_meta:
            assert not lin_s.data.isnan().any(), "Linear scale tensor contains NaN."
            assert not lin_s.data.isinf().any(), "Linear scale tensor contains Inf."
    else:
        lin_s = QuantScale()
    if self.exponent_scale_quant_dtypes:
        if range_based:
            exp_scale = dynamic_span / self.exponent_tensor_quant_span
        else:
            exp_scale = scale
        if lin_s.data is not None:
            lin_s.data = lin_s.data.expand(self.linear_scale_view_shapes[-1]).reshape(self.scale_view_shapes[-1])
            exp_scale = exp_scale / lin_s.data
        exp_s = quantize_scale(
            exp_scale,
            quant_dtypes=self.exponent_scale_quant_dtypes,
            quant_spans=self.exponent_scale_quant_spans,
            view_shapes=self.exponent_scale_view_shapes,
        )
        assert exp_s.data is not None, "Exponential scale tensor is None."
        assert not exp_s.data.isnan().any(), "Exponential scale tensor contains NaN."
        assert not exp_s.data.isinf().any(), "Exponential scale tensor contains Inf."
        s = exp_s if lin_s.data is None else lin_s.extend(exp_s)
    else:
        s = lin_s
    assert s.data is not None, "Scale tensor is None."
    assert not s.data.isnan().any(), "Scale tensor contains NaN."
    assert not s.data.isinf().any(), "Scale tensor contains Inf."
    # endregion
    # region step 3: get the zero point
    if self.has_zero_point:
        if range_based:
            if self.tensor_zero_domain == ZeroPointDomain.PreScale:
                zero = self.tensor_quant_range.min - dynamic_range.min / s.data
            else:
                zero = self.tensor_quant_range.min * s.data - dynamic_range.min
        assert isinstance(zero, torch.Tensor), "Zero point must be a tensor."
        z = simple_quantize(zero, has_zero_point=True, quant_dtype=self.zero_quant_dtype)
    else:
        z = torch.tensor(0, dtype=s.data.dtype, device=s.data.device)
    assert not z.isnan().any(), "Zero point tensor contains NaN."
    assert not z.isinf().any(), "Zero point tensor contains Inf."
    # endregion
    return s, z
```
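To see why the `is_meta` guard matters, here is a minimal, self-contained sketch (not from the deepcompressor codebase): the assert forces a `bool()` call on the reduction result, which in turn calls `Tensor.item()`, and that is exactly what meta tensors cannot do.

```python
import torch

# A meta tensor carries shape/dtype/device metadata but has no actual storage.
x = torch.empty(4, device="meta")

try:
    # The assert needs a Python bool, i.e. Tensor.item(),
    # which is not supported for meta tensors.
    assert not x.isnan().any(), "tensor contains NaN"
except RuntimeError as err:
    print(err)  # "Tensor.item() cannot be called on meta tensors"

# Guarding the check, as in the patch above, simply skips value inspection
# for meta tensors while keeping it for real ones.
if not x.is_meta:
    assert not x.isnan().any(), "tensor contains NaN"
```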

References

https://github.com/nunchaku-tech/nunchaku/commit/b99fb8be615bc98c6915bbe06a1e0092cbc074a5
https://github.com/nunchaku-tech/nunchaku/blob/main/examples/flux.1-kontext-dev.py
https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformers/transformer_flux.py#L266
https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/flux/pipeline_flux_kontext.py
https://github.com/nunchaku-tech/deepcompressor/issues/91
https://deepwiki.com/nunchaku-tech/deepcompressor

---

# Dependencies