Update README.md
README.md
CHANGED
@@ -9,4 +9,56 @@ tags:
|
|
9 |
- transformers
|
10 |
- pytorch
|
11 |
- gpt2
|
12 |
-
---
+
---
|
13 |
+
|
14 |
+
> date trained: March, 2024
|
15 |
+
|
16 |
+
A lightweight model that generates danbooru tag-based prompts from a few input tags.
|
17 |
+
|
18 |
+
|
19 |
+
|
20 |
+
The model is trained to serve as a supplementary **lightweight** model for prompt refinement:
|
21 |
+
|
22 |
+
- **small**: with only **335M** parameters, the model offers a lightweight solution for prompt filling compared to much larger models such as phi-3 or llama 8b.
|
23 |
+
- **fast**: the model generates quickly, so it interferes as little as possible with the main text generation.
|
24 |
+
- **low vram requirement**: the model uses little vram, leaving more for the main image generation model.
|
25 |
+
|
26 |
+
|
27 |
+
|
28 |
+
The model is capable of:
|
29 |
+
|
30 |
+
- **character refinement**: filling in more details about a character from a danbooru character tag (e.g. `hatsune miku`)
|
31 |
+
- **filling small details**: adding creative small details to a fairly thought-out scene
|
32 |
+
- **creative inspiration**: adding randomness to a short prompt for inspiration
|
33 |
+
- **prompt-to-prompt**: refining prompt elements to steer toward a better generation
|
34 |
+
|
35 |
+
|
36 |
+
|
37 |
+
## Training details
|
38 |
+
|
39 |
+
|
40 |
+
|
41 |
+
The model is fine-tuned from GPT2-medium on 10M prompts from a refined full pixiv dataset, in the following format:
|
42 |
+
|
43 |
+
- **rating**: [safe | nsfw]
|
44 |
+
- **chara**: [danbooru character tags]
|
45 |
+
- **date**: [2020s | 2010s | 2000s]
|
46 |
+
- **quality**: [normal | good | excellent] (by image aesthetic ratings)
|
47 |
+
- **tags**: [rest of the danbooru general tags]
|
48 |
+
- **output**: model output
|
49 |
+
|
50 |
+
|
51 |
+
|
52 |
+
Some example training entries look like this:
|
53 |
+
|
54 |
+
```
|
55 |
+
'<input rating="safe" chara="" date="2020s" quality="excellent" tags="1girl, long hair, white hair"><output>'
|
56 |
+
'<input rating="safe" chara="" date="2020s" quality="excellent" tags="1girl, purple hair, white hair"><output>'
|
57 |
+
'<input rating="safe" chara="" date="2020s" quality="excellent" tags="gothic lolita"><output>'
|
58 |
+
'<input rating="safe" chara="hatsune miku" date="2020s" quality="excellent" tags=""><output>'
|
59 |
+
```
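The entries above follow a fixed template, so an input string can be assembled from its fields. The following is a minimal sketch, not part of the original README: the helper name `build_prompt`, its defaults, and the validation against the listed field values are assumptions made for illustration.

```python
# Hypothetical helper (not from the model card): builds an input string
# in the same shape as the example training entries.
ALLOWED_RATINGS = ("safe", "nsfw")
ALLOWED_DATES = ("2020s", "2010s", "2000s")
ALLOWED_QUALITIES = ("normal", "good", "excellent")


def build_prompt(rating="safe", chara="", date="2020s",
                 quality="excellent", tags=()):
    """Return a prompt string ending in <output>, ready for the model to continue."""
    # Reject values outside the options listed in the format description.
    if rating not in ALLOWED_RATINGS:
        raise ValueError(f"rating must be one of {ALLOWED_RATINGS}")
    if date not in ALLOWED_DATES:
        raise ValueError(f"date must be one of {ALLOWED_DATES}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"quality must be one of {ALLOWED_QUALITIES}")

    # General tags are joined with ", " as in the example entries.
    tag_str = ", ".join(tags)
    return (
        f'<input rating="{rating}" chara="{chara}" date="{date}" '
        f'quality="{quality}" tags="{tag_str}"><output>'
    )


if __name__ == "__main__":
    print(build_prompt(tags=["1girl", "long hair", "white hair"]))
    print(build_prompt(chara="hatsune miku"))
```

The model is then expected to continue the text after `<output>`. How that continuation is terminated is not stated in the README, so any stopping criterion is left to the caller.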
|
60 |
+
|
61 |
+
|
62 |
+
|
63 |
+
The full dataset will be available soon.
|
64 |
+
|