- LoRA
- adapter
---

Please refer to our GitHub repository for more information: https://github.com/alibaba/wan-toy-transform

<div align="center">
<h2>Wan Toy Transform</h2>
<br>
Alibaba Research Intelligence Computing
<br>
<a href="https://github.com/alibaba/wan-toy-transform"><img src='https://img.shields.io/badge/Github-Link-black'></a>
<a href='https://modelscope.cn/models/Alibaba_Research_Intelligence_Computing/wan-toy-transform'><img src='https://img.shields.io/badge/🤖_ModelScope-weights-%23654dfc'></a>
<a href='https://huggingface.co/Alibaba-Research-Intelligence-Computing/wan-toy-transform'><img src='https://img.shields.io/badge/🤗_HuggingFace-weights-%23ff9e0e'></a>
<br>
</div>

This is a LoRA model finetuned on [Wan-I2V-14B-480P](https://github.com/Wan-Video/Wan2.1). It transforms objects in the input image into fluffy toys.

## 🐍 Installation

```bash
# Python 3.12 and PyTorch 2.6.0 are tested.
pip install torch==2.6.0 torchvision==0.21.0 --index-url https://download.pytorch.org/whl/cu124
pip install -r requirements.txt
```

## 🔄 Inference

```bash
python generate.py --prompt "The video opens with a clear view of a $name. Then it transforms to a JellyCat-style $name. It has a face and a cute, fluffy and playful appearance." --image $image_path --save_file "output.mp4" --offload_type leaf_level
```

Note:

- Replace `$name` with the name of the object you want to transform.
- `$image_path` is the path to the first-frame image.
- Choose `--offload_type` from `['leaf_level', 'block_level', 'none', 'model']`. More details can be found [here](https://huggingface.co/docs/diffusers/optimization/memory#group-offloading).
- VRAM usage and generation time for each `--offload_type` are listed below.

| `--offload_type`                     | VRAM Usage | Generation Time (NVIDIA A100) |
| ------------------------------------ | ---------- | ----------------------------- |
| leaf_level                           | 11.9 GB    | 17m17s                        |
| block_level (num_blocks_per_group=1) | 20.5 GB    | 16m48s                        |
| model                                | 39.4 GB    | 16m24s                        |
| none                                 | 55.9 GB    | 16m08s                        |
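Putting the notes above together, a concrete invocation might look like the sketch below. The helper function, the object name `teapot`, and the image path are hypothetical illustrations, not part of the repository:

```shell
# Hypothetical helper: assemble the prompt template from the Inference
# section for a given object name.
make_prompt() {
  local name="$1"
  printf 'The video opens with a clear view of a %s. Then it transforms to a JellyCat-style %s. It has a face and a cute, fluffy and playful appearance.' "$name" "$name"
}

# "teapot" and ./teapot.jpg are placeholder examples.
PROMPT="$(make_prompt teapot)"
echo "$PROMPT"
# python generate.py --prompt "$PROMPT" --image ./teapot.jpg --save_file output.mp4 --offload_type leaf_level
```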

## 🤝 Acknowledgements

Special thanks to these projects for their contributions to the community!

- [Wan2.1](https://github.com/Wan-Video/Wan2.1)
- [diffusion-pipe](https://github.com/tdrussell/diffusion-pipe)
- [diffusers](https://github.com/huggingface/diffusers)

## 📄 Our previous work

- [Tora: Trajectory-oriented Diffusion Transformer for Video Generation](https://github.com/alibaba/Tora)
- [AnimateAnything: Fine Grained Open Domain Image Animation with Motion Guidance](https://github.com/alibaba/animate-anything)