LeanQuant committed 24 days ago

Commit

45ec394

verified

1 Parent(s): b51a631

Add files using upload-large-folder tool

Files changed (43)

README.md +135 -0
blocks_0.safetensors +3 -0
blocks_1.safetensors +3 -0
blocks_10.safetensors +3 -0
blocks_11.safetensors +3 -0
blocks_12.safetensors +3 -0
blocks_13.safetensors +3 -0
blocks_14.safetensors +3 -0
blocks_15.safetensors +3 -0
blocks_16.safetensors +3 -0
blocks_17.safetensors +3 -0
blocks_18.safetensors +3 -0
blocks_19.safetensors +3 -0
blocks_2.safetensors +3 -0
blocks_20.safetensors +3 -0
blocks_21.safetensors +3 -0
blocks_22.safetensors +3 -0
blocks_23.safetensors +3 -0
blocks_24.safetensors +3 -0
blocks_25.safetensors +3 -0
blocks_26.safetensors +3 -0
blocks_27.safetensors +3 -0
blocks_28.safetensors +3 -0
blocks_29.safetensors +3 -0
blocks_3.safetensors +3 -0
blocks_30.safetensors +3 -0
blocks_31.safetensors +3 -0
blocks_32.safetensors +3 -0
blocks_33.safetensors +3 -0
blocks_34.safetensors +3 -0
blocks_35.safetensors +3 -0
blocks_36.safetensors +3 -0
blocks_37.safetensors +3 -0
blocks_38.safetensors +3 -0
blocks_39.safetensors +3 -0
blocks_4.safetensors +3 -0
blocks_5.safetensors +3 -0
blocks_6.safetensors +3 -0
blocks_7.safetensors +3 -0
blocks_8.safetensors +3 -0
blocks_9.safetensors +3 -0
config.json +24 -0
diffusion_pytorch_model.safetensors +3 -0

README.md ADDED Viewed

	@@ -0,0 +1,135 @@

+---
+base_model:
+  - Wan-AI/Wan2.2-I2V-A14B-Diffusers
+base_model_relation: quantized
+pipeline_tag: image-to-video
+tags:
+- dfloat11
+- df11
+- lossless compression
+- 70% size, 100% accuracy
+---
+# DFloat11 Compressed Model: `Wan-AI/Wan2.2-I2V-A14B`
+This is a **DFloat11 losslessly compressed** version of the original `Wan-AI/Wan2.2-I2V-A14B` model. It reduces model size by **32%** compared to the original BFloat16 model, while maintaining **bit-identical outputs** and supporting **efficient GPU inference**.
+🔥🔥🔥 Thanks to DFloat11 compression, `Wan-AI/Wan2.2-I2V-A14B` can now generate a 5-second 720P video on a single 24GB GPU, while maintaining full model quality. 🔥🔥🔥
+### 📊 Performance Comparison
+| Model                                              | Model Size         | Peak GPU Memory (5-second 720P generation)   | Generation Time (A100 GPU) |
+|----------------------------------------------------|--------------------|----------------------------------------------|----------------------------|
+| Wan-AI/Wan2.2-I2V-A14B (BFloat16)                  | ~56 GB             | O.O.M.                                       | -                          |
+| Wan-AI/Wan2.2-I2V-A14B (DFloat11)                  | 19.47 + 19.44 GB   | 29.12 GB                                     | 42 minutes                 |
+| Wan-AI/Wan2.2-I2V-A14B (DFloat11 + CPU Offloading) | 19.47 + 19.44 GB   | 20.01 GB                                     | 44 minutes                 |
+### 🔍 How It Works
+We apply Huffman coding to the exponent bits of BFloat16 model weights, which are highly compressible. We leverage hardware-aware algorithmic designs to enable highly efficient, on-the-fly weight decompression directly on the GPU. Find out more in our [research paper](https://arxiv.org/abs/2504.11651).
+### 🔧 How to Use
+1. Install or upgrade the DFloat11 pip package *(installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed)*:
+    ```bash
+    pip install -U dfloat11[cuda12]
+    ```
+2. Install the latest `diffusers` package from source:
+    ```bash
+    pip install git+https://github.com/huggingface/diffusers
+    ```
+3. Save the following code to a Python file `i2v.py`:
+    ```python
+    import time
+    import torch
+    import numpy as np
+    import argparse
+    from diffusers import WanImageToVideoPipeline
+    from diffusers.utils import export_to_video, load_image
+    from dfloat11 import DFloat11Model
+    parser = argparse.ArgumentParser(description='Image to Video generation using Wan2.2-I2V model')
+    parser.add_argument('--cpu_offload', action='store_true', help='Enable CPU offloading')
+    parser.add_argument('--image_path', type=str, default="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/wan_i2v_input.JPG", help='Path or URL to the input image')
+    parser.add_argument('--width', type=int, default=1280, help='Output video width')
+    parser.add_argument('--height', type=int, default=720, help='Output video height')
+    parser.add_argument('--prompt', type=str, default="Summer beach vacation style, a white cat wearing sunglasses sits on a surfboard. The fluffy-furred feline gazes directly at the camera with a relaxed expression. Blurred beach scenery forms the background featuring crystal-clear waters, distant green hills, and a blue sky dotted with white clouds. The cat assumes a naturally relaxed posture, as if savoring the sea breeze and warm sunlight. A close-up shot highlights the feline's intricate details and the refreshing atmosphere of the seaside.", help='Prompt for video generation')
+    parser.add_argument('--negative_prompt', type=str, default="色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走", help='Negative prompt for video generation')
+    parser.add_argument('--num_frames', type=int, default=81, help='Number of frames to generate')
+    parser.add_argument('--guidance_scale', type=float, default=3.5, help='Guidance scale for generation')
+    parser.add_argument('--num_inference_steps', type=int, default=40, help='Number of inference steps')
+    parser.add_argument('--seed', type=int, default=42, help='Random seed for generation')
+    parser.add_argument('--output', type=str, default='i2v_output.mp4', help='Output video path')
+    parser.add_argument('--fps', type=int, default=16, help='FPS of output video')
+    args = parser.parse_args()
+    image = load_image(args.image_path)
+    pipe = WanImageToVideoPipeline.from_pretrained("Wan-AI/Wan2.2-I2V-A14B-Diffusers", torch_dtype=torch.bfloat16)
+    DFloat11Model.from_pretrained(
+        "DFloat11/Wan2.2-I2V-A14B-DF11",
+        device="cpu",
+        cpu_offload=args.cpu_offload,
+        bfloat16_model=pipe.transformer,
+    )
+    DFloat11Model.from_pretrained(
+        "DFloat11/Wan2.2-I2V-A14B-2-DF11",
+        device="cpu",
+        cpu_offload=args.cpu_offload,
+        bfloat16_model=pipe.transformer_2,
+    )
+    pipe.enable_model_cpu_offload()
+    max_area = args.width * args.height
+    aspect_ratio = image.height / image.width
+    mod_value = pipe.vae_scale_factor_spatial * pipe.transformer.config.patch_size[1]
+    height = round(np.sqrt(max_area * aspect_ratio)) // mod_value * mod_value
+    width = round(np.sqrt(max_area / aspect_ratio)) // mod_value * mod_value
+    image = image.resize((width, height))
+    generator = torch.Generator(device="cuda").manual_seed(args.seed)
+    start_time = time.time()
+    output = pipe(
+        image=image,
+        prompt=args.prompt,
+        negative_prompt=args.negative_prompt,
+        height=height,
+        width=width,
+        num_frames=args.num_frames,
+        guidance_scale=args.guidance_scale,
+        num_inference_steps=args.num_inference_steps,
+        generator=generator,
+    ).frames[0]
+    print(f"Time taken: {time.time() - start_time:.2f} seconds")
+    export_to_video(output, args.output, fps=args.fps)
+    max_memory = torch.cuda.max_memory_allocated()
+    print(f"Max memory: {max_memory / (1000 ** 3):.2f} GB")
+    ```
+4. To run without CPU offloading (40GB VRAM required):
+    ```bash
+    PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python i2v.py
+    ```
+    To run with CPU offloading (22.5GB VRAM required):
+    ```bash
+    PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True python i2v.py --cpu_offload
+    ```
+    > Setting `PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True` is strongly recommended to prevent out-of-memory errors caused by GPU memory fragmentation.
+### 📄 Learn More
+* **Paper**: [70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float](https://arxiv.org/abs/2504.11651)
+* **GitHub**: [https://github.com/LeanModels/DFloat11](https://github.com/LeanModels/DFloat11)
+* **HuggingFace**: [https://huggingface.co/DFloat11](https://huggingface.co/DFloat11)
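
Note on the README's "How It Works" section: it says the exponent bits of BFloat16 weights are highly compressible with Huffman coding. The sketch below illustrates that idea only. It is not the DFloat11 kernel or file format, and it assumes synthetic Gaussian "weights" (standard deviation 0.02, chosen arbitrarily) in place of real model tensors. The helper names `bf16_exponent`, `huffman_code_lengths`, and `average_exponent_bits` are made up for this illustration.

```python
import heapq
import random
import struct
from collections import Counter


def bf16_exponent(x: float) -> int:
    """Return the 8 exponent bits of x after truncating float32 to BFloat16."""
    bits32 = struct.unpack("<I", struct.pack("<f", x))[0]
    bits16 = bits32 >> 16        # BFloat16 keeps the top 16 bits of float32
    return (bits16 >> 7) & 0xFF  # layout: 1 sign | 8 exponent | 7 mantissa


def huffman_code_lengths(freqs: dict) -> dict:
    """Map each symbol to the length in bits of its Huffman code."""
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, unique tie breaker, symbols in this subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1      # every merge adds one bit to these codes
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


def average_exponent_bits(values) -> float:
    """Average Huffman code length over the exponent fields of the values."""
    freqs = Counter(bf16_exponent(v) for v in values)
    lengths = huffman_code_lengths(freqs)
    total = sum(freqs.values())
    return sum(freqs[s] * lengths[s] for s in freqs) / total


# Synthetic stand-in for model weights (an assumption, not real Wan2.2 data).
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(20000)]
avg_bits = average_exponent_bits(weights)
print(f"average Huffman-coded exponent length: {avg_bits:.2f} bits (raw: 8)")
```

Because the values cluster in a narrow magnitude range, only a few exponent values occur often, so the average code length falls well below the raw 8 bits. The sign and mantissa bits are left untouched in this sketch.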

blocks_0.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a0da8ca4728d792a5751981ed2292bc98914ba9383b9154ab9317def311d228f
+size 471830670
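
Note on the `.safetensors` entries: each one is a three-line Git LFS pointer (`version`, `oid`, `size`), not the tensor data itself. A minimal sketch of reading such a pointer is shown below. The function name `parse_lfs_pointer` is hypothetical, and stripping the leading `+` is only needed because the text is copied from a diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.splitlines():
        line = line.lstrip("+").strip()  # drop the diff marker, if present
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file, in bytes
    }


# Pointer text for blocks_0.safetensors, copied from this commit.
pointer = parse_lfs_pointer(
    "+version https://git-lfs.github.com/spec/v1\n"
    "+oid sha256:a0da8ca4728d792a5751981ed2292bc98914ba9383b9154ab9317def311d228f\n"
    "+size 471830670\n"
)
print(pointer["hash_algo"], pointer["size"])
```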

blocks_1.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6b6e4c4bd8c44891825a69103eea7afb5be76fbf9aa0629af2bfac2e0d0fa300
+size 474203180

blocks_10.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:bd38a0e7c75d3a2db6f003018c6de0e1964f6ad498542d56c91fb25a6253ec32
+size 475010549

blocks_11.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:e61b6b90c5033fc56dec7a6655742355279689ff997aba37448b40b5f509c05d
+size 474910021

blocks_12.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6afc8f27a0b955c48b634199f027fabd95ac7682d1c68b847e18c8de884ee163
+size 474813810

blocks_13.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:cded22aea92584f349f87a51a0b41f92266016f54857f77c60163e2fadc96c73
+size 474632423

blocks_14.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1ab9ba112a7b58edd7526e8c75ea88ca07e0ed12144d11148d551cf5c8f8da69
+size 474546787

blocks_15.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c4db2b711702d5e8f557d5ad302ec88438cb2cd5576c97d1f6e8de6db01cd6ee
+size 474310710

blocks_16.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6018a945d7193fb7b66ac53cd2e990045afc5b927b7a5aa50738403ed27e5365
+size 474338329

blocks_17.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c11303fe7e33efbe98034dac57e040d1adc0b84c54d91a4911ef501cc8da7a02
+size 473914359

blocks_18.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1254f5d6be9aa4c636a34d98f774d4c912cb685f36fcc4ce7353e8608bf584c4
+size 473988258

blocks_19.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:6fd9838397ce2523ccb903e977ca96480b3d63edafa84fc3ba60284316b28d04
+size 473466770

blocks_2.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4662f88428d7fcd76ede8b6fd7b9b31ff5d03f973881cb86bb6817669c9c11a9
+size 474559972

blocks_20.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:651e283f3be5a0a69ebc099b8d781ea6d799698b2334e1d97783e8adde3b67d1
+size 473285411

blocks_21.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c9ec30d7ed7b80fac5dc17ce59e4f14c18e15466533ea73bd37cd74a1398656f
+size 473358331

blocks_22.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:1683978b9092fb90e6c9a99d764804f7167d7ca8d11b5a722f7ad4275a602a95
+size 473014488

blocks_23.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:b4fa5d867c13e506918fdb8bb10f4f3dde788f137404c0e586cb0e3a01e8cbc7
+size 472983831

blocks_24.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7b1f76b7e5dfa8cd7a7265e0d6826066eed31a260db4721e1c769ca39fc3633a
+size 472682504

blocks_25.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:37dd3f48df5794f3aadac5133683fb3a2efbed62b03319893f334013f50daf25
+size 472510731

blocks_26.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a99e86ae6e3bde86de67ae453a6bc26d8f16b33bf93ffb1015070c8a4f4e22ab
+size 472336812

blocks_27.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a40e95c7753adc2e56c42d928d9a66af6c097494c5f248250c121cf8b76c6713
+size 472148991

blocks_28.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:860886e6be9134b954472f571f89ed48f5520debbf557ff26f8491c249d81836
+size 472029862

blocks_29.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3e2f65b95eb9b86381fb9d17bdd9b5b7941c0aabbfdd8627f4c50c0e5eb62653
+size 471601609

blocks_3.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:12a8c8d4176e8b15d6e9739c0e72c28d0217a6d0887a85bcbc6db08d1c8fdeb0
+size 474831776

blocks_30.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:01c50c42b8997a5f32d33f86d72b87137dab6ea98efde013e2ed0e1012b76381
+size 471518310

blocks_31.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:44330abbdbe054cd2a86785c845e85e78da83532dfe6bc6c084f1cee1578c5ba
+size 471269801

blocks_32.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:a8d710cd4f7cd80772abc33e59fd554291e2a006ef21067325386dc8743ffd8c
+size 470945016

blocks_33.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4313464ab8e93510405a47ec91a294e81e022dd5166e1c500a0999698a708ee6
+size 470460827

blocks_34.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:2810133c47a2b3709ddeea2fa9160c829c40958de2cb5eeba04202d22fd93742
+size 470559982

blocks_35.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:4bd2a92850bbcacf9839e9e4155ffefc32602145dabf302cb20c52f6b8a07798
+size 470995787

blocks_36.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:53eb46779fc1135911f35eb39e6bd94cd9f6e109b8e0cde40882759c6d7fef8c
+size 470444427

blocks_37.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:5c68c7b32ecafc6ae36ff26c0d89c38c2ea34dff662e4f31601b291498a8d045
+size 470262471

blocks_38.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:28c28b4b90c7e1eacf4f735693a072a09c9dfd08510a3d6980ec01a065d721fd
+size 469375741

blocks_39.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:64b262baa60aa71e169b2c3368dbb40be395a07a6b8aee49b13f7995f2c1df01
+size 469061057

blocks_4.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0f94105ce002b314eee18104a2e4dc0b877fc5800a4252ea53d92f76d114a3ce
+size 475003519

blocks_5.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:44e8e45f5a46afe5a840301e6c785f583b7542af0edd2a79e605cba483246755
+size 475137978

blocks_6.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:264dc95fb3bc4a830a099585f82c83a3474318c7523f8139a016a40f8611e65c
+size 475220567

blocks_7.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:9220b84d6d1f9a4e74171764ac6e356a8d844aba8d56062c50b31ecd7c86797f
+size 475288108

blocks_8.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:656c253c2704c059aab22a43d925c1e39554611d928f10d24a237d20879dea34
+size 475272045

blocks_9.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:d247970dc1808e254905e98a0a77afa308d732046a45c69c7136cbecdef8b398
+size 475168901

config.json ADDED Viewed

	@@ -0,0 +1,24 @@

+{
+  "dfloat11_config": {
+    "bytes_per_thread": 8,
+    "pattern_dict": {
+      "blocks\\.\\d+": [
+        "attn1.to_q",
+        "attn1.to_k",
+        "attn1.to_v",
+        "attn1.to_out.0",
+        "attn2.to_q",
+        "attn2.to_k",
+        "attn2.to_v",
+        "attn2.to_out.0",
+        "ffn.net.0.proj",
+        "ffn.net.2"
+      ]
+    },
+    "threads_per_block": [
+      512
+    ],
+    "version": "0.3.1"
+  },
+  "model_type": "llama"
+}
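
Note on `pattern_dict`: the config pairs a regular expression (`blocks\.\d+`) with a list of submodule suffixes. One plausible reading, which is an assumption and not something this commit confirms, is that a module counts as DFloat11-compressed when its name starts with a match for the pattern and the remainder is one of the listed suffixes. The sketch below implements that reading only. `is_listed` is a hypothetical helper, not part of the `dfloat11` package.

```python
import json
import re

# The pattern_dict from config.json in this commit (raw string keeps the
# JSON escapes intact, so the decoded key is the regex blocks\.\d+).
CONFIG = json.loads(r'''
{
  "dfloat11_config": {
    "pattern_dict": {
      "blocks\\.\\d+": [
        "attn1.to_q", "attn1.to_k", "attn1.to_v", "attn1.to_out.0",
        "attn2.to_q", "attn2.to_k", "attn2.to_v", "attn2.to_out.0",
        "ffn.net.0.proj", "ffn.net.2"
      ]
    }
  }
}
''')


def is_listed(module_name: str, pattern_dict: dict) -> bool:
    """Assumed rule: prefix matches the regex, remainder is a listed suffix."""
    for block_pattern, suffixes in pattern_dict.items():
        match = re.match(block_pattern, module_name)
        if match is None:
            continue
        remainder = module_name[match.end():].lstrip(".")
        if remainder in suffixes:
            return True
    return False


patterns = CONFIG["dfloat11_config"]["pattern_dict"]
for name in ("blocks.12.attn1.to_q", "blocks.12.norm1", "patch_embedding"):
    print(name, is_listed(name, patterns))
```

Under this reading, the attention and feed-forward projections inside each of the 40 blocks are covered, while module names outside the list are not.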

diffusion_pytorch_model.safetensors ADDED Viewed

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:7f651271fa00c2af6b5a3a85356616427377e732c23a6a2b7b1e2a72b98ef39b
+size 521350912
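
Note on total size: the `size` fields of the LFS pointers in this commit can be summed as a rough cross-check against the model sizes in the README table. The sketch below uses decimal gigabytes (`1000 ** 3`), the same divisor the README's `i2v.py` uses for its memory printout. The sizes are copied from the pointers above, and `README.md` and `config.json` are left out as negligible. The total should land close to the 19.44 GB entry in the table.

```python
# Sizes in bytes, copied from the `size` lines of the LFS pointers above,
# keyed by block index.
BLOCK_SIZES = {
    0: 471830670, 1: 474203180, 2: 474559972, 3: 474831776,
    4: 475003519, 5: 475137978, 6: 475220567, 7: 475288108,
    8: 475272045, 9: 475168901, 10: 475010549, 11: 474910021,
    12: 474813810, 13: 474632423, 14: 474546787, 15: 474310710,
    16: 474338329, 17: 473914359, 18: 473988258, 19: 473466770,
    20: 473285411, 21: 473358331, 22: 473014488, 23: 472983831,
    24: 472682504, 25: 472510731, 26: 472336812, 27: 472148991,
    28: 472029862, 29: 471601609, 30: 471518310, 31: 471269801,
    32: 470945016, 33: 470460827, 34: 470559982, 35: 470995787,
    36: 470444427, 37: 470262471, 38: 469375741, 39: 469061057,
}
OTHER_SIZE = 521350912  # diffusion_pytorch_model.safetensors

total_bytes = sum(BLOCK_SIZES.values()) + OTHER_SIZE
total_gb = total_bytes / 1000 ** 3
print(f"{len(BLOCK_SIZES) + 1} LFS files, {total_gb:.2f} GB")
```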