LeanQuant committed on
Commit 466f022 · verified · 1 Parent(s): 54c02d5

Add files using upload-large-folder tool

README.md ADDED
@@ -0,0 +1,128 @@
1
+ ---
2
+ base_model:
3
+ - Wan-AI/Wan2.2-T2V-A14B
4
+ base_model_relation: quantized
5
+ pipeline_tag: text-to-video
6
+ tags:
7
+ - dfloat11
8
+ - df11
9
+ - lossless compression
10
+ - 70% size, 100% accuracy
11
+ ---
12
+
13
+ # DFloat11 Compressed Model: `Wan-AI/Wan2.2-T2V-A14B`
14
+
15
+ This is a **DFloat11 losslessly compressed** version of the original `Wan-AI/Wan2.2-T2V-A14B` model. It reduces model size by **32%** compared to the original BFloat16 model, while maintaining **bit-identical outputs** and supporting **efficient GPU inference**.
16
+
17
+ 🔥🔥🔥 Thanks to DFloat11 compression, `Wan-AI/Wan2.2-T2V-A14B` can now generate a 5-second 720P video on a single 24GB GPU, while maintaining full model quality. 🔥🔥🔥
18
+
19
+ ### 📊 Performance Comparison
20
+
21
+ | Model | Model Size | Peak GPU Memory (5-second 720P generation) | Generation Time (A100 GPU) |
22
+ |----------------------------------------------------|--------------------|----------------------------------------------|----------------------------|
23
+ | Wan-AI/Wan2.2-T2V-A14B (BFloat16) | ~56 GB | O.O.M. | - |
24
+ | Wan-AI/Wan2.2-T2V-A14B (DFloat11) | 19.46 + 19.39 GB | 41.06 GB | 42 minutes |
25
+ | Wan-AI/Wan2.2-T2V-A14B (DFloat11 + CPU Offloading) | 19.46 + 19.39 GB | 22.49 GB | 44 minutes |
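As a rough check on the sizes in the table, the sketch below computes the compressed-to-original ratio. It assumes the two DFloat11 figures are the sizes of the two compressed checkpoints and takes the rounded `~56 GB` figure at face value, so the result is only approximate.

```python
# Sizes in GB, copied from the comparison table.
bf16_total_gb = 56.0            # "~56 GB" is a rounded figure
df11_parts_gb = (19.46, 19.39)  # the two DFloat11 checkpoints

df11_total_gb = sum(df11_parts_gb)
ratio = df11_total_gb / bf16_total_gb

print(f"DFloat11 total: {df11_total_gb:.2f} GB")
print(f"Compressed size: {ratio:.1%} of BFloat16")
print(f"Reduction: {1 - ratio:.1%}")
```

With the rounded 56 GB baseline this comes out slightly below the 32% quoted in the README, which presumably was computed from unrounded sizes.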
26
+
27
+ ### 🔍 How It Works
28
+
29
+ We apply Huffman coding to the exponent bits of BFloat16 model weights, which are highly compressible. We leverage hardware-aware algorithmic designs to enable highly efficient, on-the-fly weight decompression directly on the GPU. Find out more in our [research paper](https://arxiv.org/abs/2504.11651).
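To illustrate why the exponent bits compress well, here is a minimal sketch that Huffman-codes the 8-bit BFloat16 exponents of synthetic weights. It is not the DFloat11 encoder or its GPU kernel: the Gaussian weights, the helper names `bf16_exponent` and `huffman_code_lengths`, and the use of truncation instead of rounding are all assumptions made for illustration.

```python
import heapq
import random
import struct
from collections import Counter

def bf16_exponent(x: float) -> int:
    """Return the 8 exponent bits of x after truncating float32 to BFloat16."""
    bits32 = struct.unpack(">I", struct.pack(">f", x))[0]
    bits16 = bits32 >> 16          # BFloat16 keeps the top 16 bits of float32
    return (bits16 >> 7) & 0xFF    # layout: 1 sign | 8 exponent | 7 mantissa

def huffman_code_lengths(freqs: Counter) -> dict:
    """Compute the Huffman code length (in bits) of each symbol."""
    if len(freqs) == 1:
        return {next(iter(freqs)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far})
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

random.seed(0)
# Synthetic stand-in for real model weights (an assumption, not real data).
weights = [random.gauss(0.0, 0.02) for _ in range(100_000)]

freqs = Counter(bf16_exponent(w) for w in weights)
lengths = huffman_code_lengths(freqs)
avg_exp_bits = sum(freqs[s] * lengths[s] for s in freqs) / len(weights)
bits_per_weight = 1 + avg_exp_bits + 7  # sign + coded exponent + mantissa

print(f"distinct exponents: {len(freqs)}")
print(f"average coded exponent length: {avg_exp_bits:.2f} bits (vs 8)")
print(f"approximate bits per weight: {bits_per_weight:.2f} (vs 16)")
```

Because only a few exponent values occur frequently, the average code length falls well below 8 bits, while the sign and mantissa bits are stored unchanged, which is what makes the scheme lossless.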
30
+
31
+ ### 🔧 How to Use
32
+
33
+ 1. Install or upgrade the DFloat11 pip package *(installs the CUDA kernel automatically; requires a CUDA-compatible GPU and PyTorch installed)*:
34
+
35
+ ```bash
36
+ pip install -U "dfloat11[cuda12]"
37
+ ```
38
+
39
+ 2. Install the latest `diffusers` package from source:
40
+
41
+ ```bash
42
+ pip install git+https://github.com/huggingface/diffusers
43
+ ```
44
+
45
+ 3. Save the following code to a Python file `t2v.py`:
46
+
47
+ ```python
48
+ import time
49
+ import torch
50
+ import argparse
51
+ from diffusers import WanPipeline, AutoencoderKLWan
52
+ from diffusers.utils import export_to_video
53
+ from dfloat11 import DFloat11Model
54
+
55
+ # Set up argument parser
56
+ parser = argparse.ArgumentParser(description='Run Wan2.2 T2V model with custom parameters')
57
+ parser.add_argument('--prompt', type=str, default="A serene koi pond at night, with glowing lanterns reflecting on the rippling water. Ethereal fireflies dance above as cherry blossoms gently fall, creating a dreamlike atmosphere.",
58
+                     help='Text prompt for video generation')
59
+ parser.add_argument('--negative_prompt', type=str, default="色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走",
60
+                     help='Negative prompt for video generation')
61
+ parser.add_argument('--width', type=int, default=1280, help='Width of output video')
62
+ parser.add_argument('--height', type=int, default=720, help='Height of output video')
63
+ parser.add_argument('--num_frames', type=int, default=81, help='Number of frames to generate')
64
+ parser.add_argument('--guidance_scale', type=float, default=4.0, help='Guidance scale for first stage')
65
+ parser.add_argument('--guidance_scale_2', type=float, default=3.0, help='Guidance scale for second stage')
66
+ parser.add_argument('--num_inference_steps', type=int, default=40, help='Number of inference steps')
67
+ parser.add_argument('--cpu_offload', action='store_true', help='Enable CPU offloading')
68
+ parser.add_argument('--output', type=str, default='t2v_out.mp4', help='Output video file path')
69
+ parser.add_argument('--fps', type=int, default=16, help='FPS of output video')
70
+
71
+ args = parser.parse_args()
72
+
73
+ # Initialize models
74
+ vae = AutoencoderKLWan.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", subfolder="vae", torch_dtype=torch.float32)
75
+ pipe = WanPipeline.from_pretrained("Wan-AI/Wan2.2-T2V-A14B-Diffusers", vae=vae, torch_dtype=torch.bfloat16)
76
+
77
+ # Load DFloat11 models
78
+ DFloat11Model.from_pretrained(
79
+     "DFloat11/Wan2.2-T2V-A14B-DF11",
80
+     device="cpu",
81
+     cpu_offload=args.cpu_offload,
82
+     bfloat16_model=pipe.transformer,
83
+ )
84
+ DFloat11Model.from_pretrained(
85
+     "DFloat11/Wan2.2-T2V-A14B-2-DF11",
86
+     device="cpu",
87
+     cpu_offload=args.cpu_offload,
88
+     bfloat16_model=pipe.transformer_2,
89
+ )
90
+
91
+ pipe.enable_model_cpu_offload()
92
+
93
+ start_time = time.time()
94
+ # Generate video
95
+ output = pipe(
96
+     prompt=args.prompt,
97
+     negative_prompt=args.negative_prompt,
98
+     height=args.height,
99
+     width=args.width,
100
+     num_frames=args.num_frames,
101
+     guidance_scale=args.guidance_scale,
102
+     guidance_scale_2=args.guidance_scale_2,
103
+     num_inference_steps=args.num_inference_steps,
104
+ ).frames[0]
105
+ print(f"Time taken: {time.time() - start_time:.2f} seconds")
106
+
107
+ export_to_video(output, args.output, fps=args.fps)
108
+
109
+ # Print memory usage
110
+ max_memory = torch.cuda.max_memory_allocated()
111
+ print(f"Max memory: {max_memory / (1000 ** 3):.2f} GB")
112
+ ```
113
+
114
+ 4. To run without CPU offloading (40GB VRAM required):
115
+ ```bash
116
+ python t2v.py
117
+ ```
118
+
119
+ To run with CPU offloading (22.5GB VRAM required):
120
+ ```bash
121
+ python t2v.py --cpu_offload
122
+ ```
123
+
124
+ ### 📄 Learn More
125
+
126
+ * **Paper**: [70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float](https://arxiv.org/abs/2504.11651)
127
+ * **GitHub**: [https://github.com/LeanModels/DFloat11](https://github.com/LeanModels/DFloat11)
128
+ * **HuggingFace**: [https://huggingface.co/DFloat11](https://huggingface.co/DFloat11)
blocks_0.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:60a9503666d36428f691a1820f4798e727935635ec1e6abe3b81536c2d24bf76
3
+ size 472793801
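Each `.safetensors` entry in this commit is a Git LFS pointer: a small text stub with `version`, `oid`, and `size` fields that stands in for the real file. A minimal sketch of reading one, using the `blocks_0.safetensors` pointer as input; `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:60a9503666d36428f691a1820f4798e727935635ec1e6abe3b81536c2d24bf76
size 472793801
"""

info = parse_lfs_pointer(pointer)
size_gb = info["size_bytes"] / 1000 ** 3
print(info["hash_algo"], info["digest"][:12], f"{size_gb:.3f} GB")
```

The `size` field is the byte count of the actual weight file, so summing it across the pointers gives the download size of the repository's weights.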
blocks_1.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ea85936c45f3995f6a50d1f066f5ee0c1f9788e3b570ef245a03e6fae44456c9
3
+ size 474408424
blocks_10.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:293e9c1644511b04edfea28189bd6140fe66977107989e2498cc88451020abdb
3
+ size 475233533
blocks_11.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8a3ba217c17ea4e62aa4963d08966eba60718c88adaf1c8943516031d045ca6f
3
+ size 475118920
blocks_12.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:86682f06c67c87d80daa3f9528e115a21db3bf1e0d1d17ec5938a255c49b7b67
3
+ size 474963079
blocks_13.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:491805456f08a54679337ccd5615a465670ce8ff28ec53a2692c95ae4e709ed0
3
+ size 474753276
blocks_14.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:425579cb4d3d4a9c0ccd3ce99cd16813e3bf8595b30d562485ec41ac21fe489c
3
+ size 474618112
blocks_15.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8cc89421c92a1b2b9f15b9f5f3247a116be9d0a3f004a594930ab290c3b9e804
3
+ size 474352944
blocks_16.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c15bac483d82afd3ec7892512e8f03fa8624053d9c27c7eaee34611affbdbf4c
3
+ size 474443492
blocks_17.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:34ad7165b97f31b80388f2c78226fd2d1a677090afe27b9e120fd5c9d5b0241a
3
+ size 474243679
blocks_18.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0d13bc3e2f45e0f4592441b852119ab01484856495ff5a22f389eaaab2b31c5c
3
+ size 474412119
blocks_19.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bdb6c95c1aa13d5be2887c77e121f582d7c980d1650dc56904c24d4aa8d1517e
3
+ size 474082263
blocks_2.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c76cccba2a23ea8c3de719772c1b2a43cfd3e9eb02c96b8eb65a4fedbb33237f
3
+ size 474811243
blocks_20.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d7068c91d2251ee2d596618cb99aa16f60ace5ac415d3e85ab69a8f9a1504bef
3
+ size 473915883
blocks_21.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:678021d42e2e6a7e84a1d6dcae0cec67b48512d9dfde306e77e933a10a22d3aa
3
+ size 473899109
blocks_22.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:df39b4e405bcfee6d71847535148a2f2ac3e6191999e43fa792b822dcad9394a
3
+ size 473588753
blocks_23.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:01edf304f6e1031780f90e2eca08c706d6cb7e4a5b8a9971e238f1998a553431
3
+ size 473518168
blocks_24.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:21adedafc588ca637a845f3e738d3963f2be6ba752e18c4bac1b3ea1e4a19696
3
+ size 473084002
blocks_25.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:8dfb0543e551c5134da865b098b09424113451f38ee99eb96522e15ca87bc52a
3
+ size 472838962
blocks_26.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e5757ae4baa024d0455705e7682d28158ed9ef5b919506b30dbbb8afd8eba31a
3
+ size 472698803
blocks_27.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b00e7cfa1f463308adf2c38f3b3879816c7b8b26af74af6e011e0819039cc6bf
3
+ size 472585399
blocks_28.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b4c0ab7c95af7fac65cf78b8f75201bdb25bc080fa9fb2ba119b2428fa88e877
3
+ size 472466417
blocks_29.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1f12762305003c40d6961846afa5ac951dc3cf813de88a5735efb3695268be92
3
+ size 472234992
blocks_3.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:49978c3d6c2e77731909f584360182dc5cc7bb2f9ec24de9c0e3fd1013d121a5
3
+ size 475082088
blocks_30.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:182e80328abde9b6d37fc8594b92bbadfd271093a7992d358f3b974ed1cc9373
3
+ size 472128713
blocks_31.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fe33bce521719e4bb1caae2e61315f4390890f03f69b110f6af8fcadb6592d97
3
+ size 471971038
blocks_32.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1f2f2070d58ff3d0e7aa590af4579b17ccde9622dc2bb725a426848db21eb84b
3
+ size 471576162
blocks_33.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:080e901eb1a82c79cc204b5c0a5afb95533462c59627beaf84f83adea390672b
3
+ size 470937481
blocks_34.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:504e606f05ed8c820d1edf80df5bcf03b1ab17efef29e1c22a480f52f43cef48
3
+ size 470987119
blocks_35.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:018c4f100d9481dd05c6389436b2aed32d91589226f329777bf3514af8a20f01
3
+ size 471382723
blocks_36.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0e93a6f783748f353766760464898bb30322574a0f6a5376ef9d3077817bdc38
3
+ size 470828082
blocks_37.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:1a6e955e121f1206dfb4f7775fd135469700e939454cc91a20be4742f0c97ac2
3
+ size 470736432
blocks_38.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e08565adcbc1edafd967e5df2249ec9aafc69f4c33bc4ad6e615c1f30cbadd4a
3
+ size 469982700
blocks_39.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:308cfe55c5f27db6d25a69c40c706c4353c30b9bde510a528124bd43f28417e0
3
+ size 469731656
blocks_4.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:082bc621faaf0ae21f8d27ec5633fa6bdf5e4688c9ddd832028f951f21f77281
3
+ size 475251559
blocks_5.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:91f84dcd5540627890682063992f7cd8df4742130b9bbf43bca95d169e29db90
3
+ size 475327083
blocks_6.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bd573b21903ff060ff889d23d806b663a7bacb73653eefe2a5da76e4bbd738f0
3
+ size 475426670
blocks_7.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:214d2efcde62991d4e1cc2ac152edbb3cb9392af03b9393283648b941fd63c26
3
+ size 475477011
blocks_8.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a1005e17381ffe23319f7dae49ba52a91c8029333172f4e8f88a05cd353f9965
3
+ size 475499649
blocks_9.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9eb73a6d353cf34485b1391284e0908bc18dda9e8c4aa98d1a0ea41ac605203e
3
+ size 475399044
config.json ADDED
@@ -0,0 +1,24 @@
1
+ {
2
+ "dfloat11_config": {
3
+ "bytes_per_thread": 8,
4
+ "pattern_dict": {
5
+ "blocks\\.\\d+": [
6
+ "attn1.to_q",
7
+ "attn1.to_k",
8
+ "attn1.to_v",
9
+ "attn1.to_out.0",
10
+ "attn2.to_q",
11
+ "attn2.to_k",
12
+ "attn2.to_v",
13
+ "attn2.to_out.0",
14
+ "ffn.net.0.proj",
15
+ "ffn.net.2"
16
+ ]
17
+ },
18
+ "threads_per_block": [
19
+ 512
20
+ ],
21
+ "version": "0.3.1"
22
+ },
23
+ "model_type": "t5"
24
+ }
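The `pattern_dict` in `config.json` pairs a regular expression over block names with a list of submodule names. How the `dfloat11` package consumes it is not shown in this commit, so the sketch below only illustrates one plausible reading: match each top-level module name against the pattern and expand it into full submodule paths. The function `compressed_targets`, the use of `re.fullmatch`, and the example module names are assumptions.

```python
import json
import re

# Abbreviated copy of the pattern_dict from config.json.
config = json.loads(r"""
{
  "dfloat11_config": {
    "pattern_dict": {
      "blocks\\.\\d+": ["attn1.to_q", "attn1.to_k", "ffn.net.0.proj", "ffn.net.2"]
    }
  }
}
""")

def compressed_targets(module_names, pattern_dict):
    """Map each module name matching a pattern to its listed submodule paths."""
    targets = {}
    for name in module_names:
        for pattern, submodules in pattern_dict.items():
            if re.fullmatch(pattern, name):
                targets[name] = [f"{name}.{sub}" for sub in submodules]
    return targets

# Hypothetical module names, for illustration only.
names = ["blocks.0", "blocks.39", "condition_embedder", "proj_out"]
result = compressed_targets(names, config["dfloat11_config"]["pattern_dict"])

for name, paths in result.items():
    print(name, "->", paths)
```

Under this reading, only the attention and feed-forward projections inside each `blocks.N` module would be stored in compressed form, which matches the one-file-per-block layout of the `blocks_*.safetensors` files.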
diffusion_pytorch_model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:553786373d35ec220f3780015bcfb90ab05497327a53cc8a30f29168410bec38
3
+ size 520531712