	---
	base_model:
	- Alpha-VLLM/Lumina-Image-2.0
	license: other
	license_name: fair-ai-public-license-1.0-sd
	license_link: https://freedevproject.org/faipl-1.0-sd/
	---
	[Chinese version of the model card](https://huggingface.co/neta-art/Neta-Lumina/blob/main/README-ZH.md)

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/655319e00166ff6bd2351948/yp3wklEAT2JJ000dqqr1r.png)

	# Introduction

	Neta Lumina is a high‑quality anime‑style image‑generation model developed by Neta.art Lab.
	Building on the open‑source Lumina‑Image‑2.0 released by the Alpha‑VLLM team at Shanghai AI Laboratory, we fine‑tuned the model on a large corpus of high‑quality anime images and multilingual tag data. The preliminary result is a model with strong prompt comprehension and interpretation (thanks to the Gemma text encoder), well suited to illustration, posters, storyboards, character design, and more.

	## Key Features

	- Optimized for diverse creative scenarios such as Furry, Guofeng (traditional‑Chinese aesthetics), pets, etc.
	- Wide coverage of characters and styles, from popular to niche concepts. (Danbooru tags are still supported!)
	- Accurate natural‑language understanding with excellent adherence to complex prompts.
	- Native multilingual support, with Chinese, English, and Japanese recommended first.

	## Model Versions

	For models still in alpha testing, request access at https://huggingface.co/neta-art/NetaLumina_Alpha if you are interested. We will keep updating.


	### neta-lumina-beta-0624-raw

	- Primary Goal: General knowledge and anime‑style optimization
	- Data Set: >13 million anime‑style images
	- Training compute: >46,000 A100 hours
	- Higher quality ceiling, suited to advanced users. Check the [Neta Lumina Prompt Book](https://nieta-art.feishu.cn/wiki/RVBgwvzBqiCvQ7kOMm1cM6NdnNc) for better results.

	### neta-lumina-beta-0624-aes

	- First beta release candidate
	- Primary Goal: Enhanced aesthetics, pose accuracy, and scene detail
	- Data Set: Hundreds of thousands of handpicked high‑quality anime images (fine‑tuned on an older version of the raw model)
	- User-friendly, suitable for most people.
	<br>

	# How to Use

	[Try it at Hugging Face playground](https://huggingface.co/spaces/neta-art/NetaLumina_T2I_Playground)

	## ComfyUI
	Neta Lumina is built on the Lumina2 Diffusion Transformer (DiT) framework, so please follow these steps precisely.

	### Environment Requirements

	Currently Neta Lumina runs only on ComfyUI:
	- Latest ComfyUI installation
	- ≥ 8 GB VRAM

	### Downloads & Installation

	Original (component) release

	1. Neta Lumina-Beta
	- Download link: https://huggingface.co/neta-art/Neta-Lumina/blob/main/neta-lumina-beta-0624.pth
	- Save path: `ComfyUI/models/unet/`
	2. Text Encoder (Gemma-2B)
	- Download link: https://huggingface.co/neta-art/Neta-Lumina/resolve/main/gemma_2_2b_fp16.safetensors
	- Save path: `ComfyUI/models/text_encoders/`
	3. VAE Model (16-Channel FLUX VAE)
	- Download link: https://huggingface.co/neta-art/Neta-Lumina/resolve/main/ae.safetensors
	- Save path: `ComfyUI/models/vae/`
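
The file placement above can be scripted. The following is a minimal sketch, not an official installer: it assumes all three files can be fetched from the repository's `resolve/main` URL pattern (the source lists a `blob` page link for the `.pth` file, so that URL is an assumption), and the helper names `plan_downloads` and `download_components` are hypothetical.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: every component is served under the same resolve/main prefix.
REPO_URL = "https://huggingface.co/neta-art/Neta-Lumina/resolve/main"

# File name -> ComfyUI models subdirectory, as listed in the steps above.
COMPONENTS = {
    "neta-lumina-beta-0624.pth": "unet",
    "gemma_2_2b_fp16.safetensors": "text_encoders",
    "ae.safetensors": "vae",
}


def plan_downloads(comfyui_root):
    """Return (url, destination) pairs without touching the network."""
    models_dir = Path(comfyui_root) / "models"
    return [
        (f"{REPO_URL}/{name}", models_dir / subdir / name)
        for name, subdir in COMPONENTS.items()
    ]


def download_components(comfyui_root):
    """Download each component that is not already present on disk."""
    for url, dest in plan_downloads(comfyui_root):
        dest.parent.mkdir(parents=True, exist_ok=True)
        if not dest.exists():
            urlretrieve(url, dest)  # large files; this can take a while


# Usage (downloads several GB): download_components("ComfyUI")
```

Keeping the planning step separate from the download makes it easy to print and check the target paths before any large transfer starts.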

	Workflow: load [`lumina_workflow.json`](https://huggingface.co/neta-art/NetaLumina_Alpha/blob/main/lumina_workflow.json) in ComfyUI.

	![image/png](https://cdn-uploads.huggingface.co/production/uploads/655319e00166ff6bd2351948/XPWf7M1OE5DogKwNlnQIk.png)

	- `UNETLoader` – loads the `.pth`
	- `VAELoader` – loads `ae.safetensors`
	- `CLIPLoader` – loads `gemma_2_2b_fp16.safetensors`
	- `Text Encoder` – connects positive/negative prompts to the KSampler

	Simple merged release

	Download [`neta-lumina-beta-0624-all-in-one.safetensors`](https://huggingface.co/neta-art/Neta-Lumina/tree/main) (`md5sum = dca54fef3c64e942c1a62a741c4f9d8a`) and load it with ComfyUI’s simple checkpoint loader workflow.
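
To confirm that the merged checkpoint downloaded intact, its MD5 digest can be compared with the value given above. This is a small sketch using only Python's standard library; the function names and the example path are hypothetical.

```python
import hashlib

# MD5 published above for neta-lumina-beta-0624-all-in-one.safetensors.
EXPECTED_MD5 = "dca54fef3c64e942c1a62a741c4f9d8a"


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 matches the expected digest."""
    return file_md5(path) == expected.lower()


# Usage (path is an assumption about where the file was saved):
# verify_checkpoint("ComfyUI/models/checkpoints/neta-lumina-beta-0624-all-in-one.safetensors")
```

Reading in chunks keeps memory use flat even for multi-gigabyte checkpoints.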

	### Recommended Settings

	- Sampler: `res_multistep`
	- Scheduler: `linear_quadratic`
	- Steps: 30
	- CFG (guidance): 4 – 5.5
	- EmptySD3LatentImage resolution: 1024 × 1024, 768 × 1532, or 968 × 1322
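
For scripted use, the recommended settings can be collected in one place. The sketch below is illustrative only: the input key names (`sampler_name`, `scheduler`, `steps`, `cfg`, `width`, `height`, `batch_size`) are assumed to mirror the widget names on ComfyUI's `KSampler` and `EmptySD3LatentImage` nodes, and the helper functions are hypothetical.

```python
# Recommended values from the list above.
RECOMMENDED = {
    "sampler_name": "res_multistep",
    "scheduler": "linear_quadratic",
    "steps": 30,
    "cfg_range": (4.0, 5.5),
    "resolutions": [(1024, 1024), (768, 1532), (968, 1322)],
}


def ksampler_inputs(cfg=4.5, seed=0):
    """Build sampler settings, rejecting CFG values outside the recommended range."""
    low, high = RECOMMENDED["cfg_range"]
    if not low <= cfg <= high:
        raise ValueError(f"cfg {cfg} is outside the recommended range {low}-{high}")
    return {
        "seed": seed,
        "steps": RECOMMENDED["steps"],
        "cfg": cfg,
        "sampler_name": RECOMMENDED["sampler_name"],
        "scheduler": RECOMMENDED["scheduler"],
    }


def latent_inputs(index=0, batch_size=1):
    """Pick one of the recommended resolutions by its position in the list."""
    width, height = RECOMMENDED["resolutions"][index]
    return {"width": width, "height": height, "batch_size": batch_size}
```

Treating the CFG range as a hard check is a design choice of this sketch; values outside 4 to 5.5 are not forbidden by the model, only not recommended.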
	<br>

	# Prompt Book

	Detailed prompt guidelines: [Neta Lumina Prompt Book](https://nieta-art.feishu.cn/wiki/RVBgwvzBqiCvQ7kOMm1cM6NdnNc)
	<br>

	# Community

	- Discord: https://discord.com/invite/TTTGccjbEa
	- QQ group: 785779037
	<br>

	# Roadmap

	## Model

	- Continuous base‑model training to raise reasoning capability.
	- Aesthetic‑dataset iteration to improve anatomy, background richness, and overall appeal.
	- Smarter, more versatile tagging tools to lower the creative barrier.

	## Ecosystem

	- LoRA training tutorials and components
	- Experienced users may already fine‑tune via Lumina‑Image‑2.0’s open code.
	- Development of advanced control / style‑consistency features (e.g., [Omini Control](https://arxiv.org/pdf/2411.15098)). [Call for Collaboration!](https://discord.com/invite/TTTGccjbEa)
	<br>

	# License & Disclaimer

	- Neta Lumina is released under the [Fair AI Public License 1.0‑SD](https://freedevproject.org/faipl-1.0-sd/)
	- Any modifications, merges, or derivative models must themselves be open‑sourced.
	<br>

	# Participants & Contributors

	- Special thanks to the Alpha‑VLLM team for open‑sourcing Lumina‑Image‑2.0
	- Model development: Neta.art Lab (Civitai)
	- Core Trainer: li_li [Civitai](https://civitai.com/user/li_li) ・ [Hugging Face](https://huggingface.co/heziiiii)
	<br>
	- Partners
	- nebulae: [Civitai](https://civitai.com/user/kitarz) ・ [Hugging Face](https://huggingface.co/NebulaeWis)
	- [narugo1992](https://github.com/narugo1992) & [deepghs](https://huggingface.co/deepghs): open datasets, processing tools, and models
	- [Naifu](https://github.com/Mikubill/naifu) trainer at [Mikubill](https://github.com/Mikubill)
	<br>

	# Community Contributors

	Evaluators & developers: 二小姐, spawner, Rnglg2
	Other contributors: 沉迷摸鱼, poi氵, ashan, 十分无奈, GHOSTLXH, wenaka, iiiiii, 年糕特工队, 恩匹希, 奶冻美宣集, mumu, yizyin, smile
	<br>

	# Appendix & Resources

	- TeaCache: https://github.com/spawner1145/CUI-Lumina2-TeaCache
	- Advanced samplers & TeaCache guide (by spawner): https://docs.qq.com/doc/DZEFKb1ZrZVZiUmxw?nlc=1
	- Neta Lumina ComfyUI Manual (in Chinese): https://docs.qq.com/doc/DZEVQZFdtaERPdXVh