|
|
--- |
|
|
license: apache-2.0 |
|
|
base_model: |
|
|
- Wan-AI/Wan2.2-TI2V-5B-Diffusers |
|
|
--- |
|
|
|
|
|
# SDXL latent to image |
|
|
|
|
|
This model takes a 4-channel SDXL latent and decodes it into an image with the [WanDecoder3d module](https://huggingface.co/Wan-AI/Wan2.2-TI2V-5B-Diffusers/tree/main/vae).
|
|
|
|
|
After a short warmup phase, the head of the WanDecoder3d was incorporated into the training process.
|
|
|
|
|
During the warmup, the model learned the color space; afterwards, the imported and modified head improved the stability of the decoded images.
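A minimal sketch of what importing the head could look like: filter a pretrained state dict down to the head parameters and load only those into the custom decoder. The `conv_out` key prefix and the tensor shapes below are illustrative assumptions, not the actual WanDecoder3d parameter names.

```python
import torch

# Illustrative pretrained state dict; the 'conv_out' prefix for the
# decoder head is an assumption, not the real WanDecoder3d naming.
pretrained = {
    'mid_block.weight': torch.zeros(8, 8, 3, 3),
    'conv_out.weight': torch.zeros(3, 8, 3, 3),
    'conv_out.bias': torch.zeros(3),
}

# Keep only the head parameters; these could then be loaded into the
# custom decoder with load_state_dict(head_state, strict=False).
head_state = {k: v for k, v in pretrained.items() if k.startswith('conv_out')}
print(sorted(head_state))  # ['conv_out.bias', 'conv_out.weight']
```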
|
|
|
|
|
```python
import torch
from diffusers import AutoencoderKLWan
from torchvision import transforms

# WanXL is the model defined in this repository.

if __name__ == '__main__':
    model = WanXL()
    vae = AutoencoderKLWan.from_pretrained('Wan-AI/Wan2.2-TI2V-5B-Diffusers', subfolder='vae')

    # A 4-channel SDXL latent; here random noise stands in for a real one.
    z = torch.randn(1, 4, 128, 128)  # (B, C, H, W)

    # Translate the SDXL latent into the Wan latent space (adds a time axis).
    x = model(z)  # (B, C, T, H, W)

    # Decode with the Wan VAE and convert the single frame to a PIL image.
    image = transforms.functional.to_pil_image(model.decode_by(vae, x).squeeze())
```
|
|
|
|
|
The SDXL latent was generated with this [VAE](https://huggingface.co/Laxhar/noobai-XL-Vpred-1.0/tree/main/vae).
|
|
|
|
|
As shown in the example, a target image size of 1024px is preferred, since the original latent encoding is lossy.
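The 1024px target follows from the latent shape in the example above. Assuming the linked VAE uses the standard SDXL design (8x spatial compression, 4 latent channels), the arithmetic is:

```python
# Why the example targets 1024px: an SDXL-style VAE downsamples each
# spatial dimension by 8x, so a 128x128 latent maps to a 1024x1024 image.
latent_h = latent_w = 128
spatial_scale = 8  # assumed SDXL VAE downsampling factor

image_h, image_w = latent_h * spatial_scale, latent_w * spatial_scale
print(image_h, image_w)  # 1024 1024
```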
|
|
|
|
|
## Datasets |
|
|
|
|
|
- 12TPICS |
|
|
- jlbaker361/flickr_humans |