Update README.md
README.md
CHANGED
|
@@ -9,4 +9,56 @@ tags:
|
|
| 9 |
- transformers
|
| 10 |
- pytorch
|
| 11 |
- gpt2
|
| 12 |
-
---
| 9 |
- transformers
|
| 10 |
- pytorch
|
| 11 |
- gpt2
|
| 12 |
+
---
|
| 13 |
+
|
| 14 |
+
> date trained: March, 2024
|
| 15 |
+
|
| 16 |
+
A lightweight model that generates danbooru tag-based prompts from a few input tags.
|
| 17 |
+
|
| 18 |
+
|
| 19 |
+
|
| 20 |
+
The model is trained to serve as a supplementary **lightweight** model for prompt refinement:
|
| 21 |
+
|
| 22 |
+
- **small**: with only **335M** parameters, the model offers a lightweight solution for prompt filling compared to much larger models such as phi-3 or llama 8b.
|
| 23 |
+
- **fast**: the model generates quickly, so that it interferes with the main text generation as little as possible.
|
| 24 |
+
- **low vram requirement**: the model uses little vram, leaving more for the main image generation model.
|
| 25 |
+
|
| 26 |
+
|
| 27 |
+
|
| 28 |
+
The model is capable of:
|
| 29 |
+
|
| 30 |
+
- **character refinement**: filling in more details about a character from a danbooru character tag (e.g. `hatsune miku`)
|
| 31 |
+
- **filling small details**: adding small creative details to a fairly thought-out scene
|
| 32 |
+
- **creative inspirations**: adding randomness to a short prompt for inspiration
|
| 33 |
+
- **prompt-to-prompt**: refining prompt elements to steer toward a better generation
|
| 34 |
+
|
| 35 |
+
|
| 36 |
+
|
| 37 |
+
## Training details
|
| 38 |
+
|
| 39 |
+
|
| 40 |
+
|
| 41 |
+
The model is finetuned from GPT2-medium on 10M prompts from a refined full pixiv dataset, each in the following format:
|
| 42 |
+
|
| 43 |
+
- **rating**: [safe | nsfw]
|
| 44 |
+
- **chara**: [danbooru character tags]
|
| 45 |
+
- **date**: [2020s | 2010s | 2000s]
|
| 46 |
+
- **quality**: [normal | good | excellent] (by image aesthetic ratings)
|
| 47 |
+
- **tags**: [rest of the danbooru general tags]
|
| 48 |
+
- **output**: model output
|
| 49 |
+
|
| 50 |
+
|
| 51 |
+
|
| 52 |
+
Some example training entries look like this:
|
| 53 |
+
|
| 54 |
+
```
|
| 55 |
+
'<input rating="safe" chara="" date="2020s" quality="excellent" tags="1girl, long hair, white hair"><output>'
|
| 56 |
+
'<input rating="safe" chara="" date="2020s" quality="excellent" tags="1girl, purple hair, white hair"><output>'
|
| 57 |
+
'<input rating="safe" chara="" date="2020s" quality="excellent" tags="gothic lolita"><output>'
|
| 58 |
+
'<input rating="safe" chara="hatsune miku" date="2020s" quality="excellent" tags=""><output>'
|
| 59 |
+
```
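
The entries above follow a fixed template, so an input string can be assembled from its fields. Below is a minimal sketch, assuming the attribute order and quoting shown in the examples; `build_prompt` and `ALLOWED` are hypothetical names used for illustration and are not part of the model's own tooling.

```python
# Allowed values for the closed-vocabulary fields, as listed in the format above.
ALLOWED = {
    "rating": {"safe", "nsfw"},
    "date": {"2020s", "2010s", "2000s"},
    "quality": {"normal", "good", "excellent"},
}


def build_prompt(tags=(), chara="", rating="safe", date="2020s", quality="excellent"):
    """Assemble an input string in the same layout as the training entries.

    Hypothetical helper: the layout is copied from the example entries,
    not taken from an official API.
    """
    for name, value in (("rating", rating), ("date", date), ("quality", quality)):
        if value not in ALLOWED[name]:
            raise ValueError(f"{name} must be one of {sorted(ALLOWED[name])}, got {value!r}")
    # Tags are joined with ", " as in the examples; empty tags are dropped.
    tag_str = ", ".join(t.strip() for t in tags if t.strip())
    return (
        f'<input rating="{rating}" chara="{chara}" date="{date}" '
        f'quality="{quality}" tags="{tag_str}"><output>'
    )


if __name__ == "__main__":
    print(build_prompt(["1girl", "long hair", "white hair"]))
    print(build_prompt(chara="hatsune miku"))
```

The resulting string ends with `<output>`, so the model's continuation would be the generated part of the prompt.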
|
| 60 |
+
|
| 61 |
+
|
| 62 |
+
|
| 63 |
+
The full dataset will be available soon.
|
| 64 |
+
|