Pixel Art LoRA — Stuffed Animal Sprite Converter

SDXL DreamBooth LoRA trained to convert stuffed animal photos into 8-bit pixel art sprites.

Developed as part of the 몽글마을 (Mongle Village) project — an AI-based stuffed animal persona app.


What it does

Converts a photo of a stuffed animal into a 16x16-style pixel art character sprite with a white background, bold outlines, and flat colors.

Input photo → Background removal (rembg) → Edge detection (ControlNet) → Pixel art (this LoRA)

Model Details

| Item | Value |
| --- | --- |
| Base model | stabilityai/stable-diffusion-xl-base-1.0 |
| Training method | DreamBooth LoRA |
| LoRA rank | 32 |
| Training steps | 1,500 |
| Learning rate | 5e-5 |
| Dataset | 236 images (7 categories: animals, food, characters, objects, etc.) |
| Training time | ~17 min on RTX 3060 (12GB VRAM) |
| File size | 177.4 MB |

Performance (20 test images, vs 6 other models)

| Metric | Score | Rank |
| --- | --- | --- |
| SSIM ↑ | 0.5986 | 🥈 2nd |
| LPIPS ↓ (AlexNet) | 0.6450 | 🥉 3rd |
| CLIP Score ↑ | 27.92 | 4th |
| Color count ↓ | 19,304 | 🥈 2nd |
| Generation success rate | 100% | 🥇 1st |
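
The card does not say how "Color count" was measured. A minimal sketch, assuming it means the number of distinct RGB values in the generated image (lower is closer to a limited pixel-art palette), could look like this. The helper name `count_unique_colors` is hypothetical and is not part of this repository.

```python
from typing import Iterable, Tuple

RGB = Tuple[int, int, int]


def count_unique_colors(pixels: Iterable[RGB]) -> int:
    """Return the number of distinct RGB values in a flat pixel sequence.

    Assumption: this mirrors the "Color count" metric; the model card
    does not document the exact measurement.
    """
    return len(set(pixels))


# Tiny 2x3 "image" flattened to a list: white, black, and one red tone.
pixels = [
    (255, 255, 255), (0, 0, 0), (255, 255, 255),
    (200, 30, 30), (0, 0, 0), (200, 30, 30),
]
n_colors = count_unique_colors(pixels)
```

With Pillow, `Image.getcolors(maxcolors=...)` gives per-color counts for a real image and can serve the same purpose.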

How to use

Intended deployment

This repository is designed to be downloaded from the HuggingFace Hub and run on a RunPod GPU server.

RunPod server
  -> download this HuggingFace repo
  -> load pipeline.py
  -> run rembg + Canny + ControlNet + SDXL + LoRA + quantization
  -> expose the result through an API

HuggingFace stores the LoRA weights and pipeline code. The actual inference runs on RunPod.

Requirements

pip install -r requirements.txt

Download from HuggingFace and run locally/on RunPod

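import sys

from huggingface_hub import snapshot_download
from PIL import Image

# Download the LoRA weights and pipeline code from the Hub (cached after the first call).
repo_dir = snapshot_download("Hadimeeee/pixel-art-lora-sdxl")

# Make the repo's pipeline.py importable.
sys.path.insert(0, repo_dir)
from pipeline import load_pipeline

pipe = load_pipeline(repo_dir)
image = Image.open("your_image.jpg").convert("RGB")
result = pipe(image)["image"]  # the generated sprite is under the "image" key
result.save("pixel_art_result.png")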

RunPod serverless handler

Use runpod_handler.py as the serverless entrypoint. The handler expects a base64-encoded image:

{
  "input": {
    "image": "<base64 png or jpeg>",
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "controlnet_conditioning_scale": 0.8,
    "strength": 0.75,
    "quantize": true,
    "n_colors": 32
  }
}
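
A client can assemble this request body with the standard library alone. The sketch below is one way to do it; `build_request` is a hypothetical helper, and treating the values from the example above as defaults is an assumption, since the card does not state the handler's actual defaults.

```python
import base64
import json


def build_request(image_bytes: bytes, **overrides) -> str:
    """Return a JSON request body in the shape shown above."""
    # Values copied from the example request; assumed, not confirmed, defaults.
    params = {
        "num_inference_steps": 50,
        "guidance_scale": 7.5,
        "controlnet_conditioning_scale": 0.8,
        "strength": 0.75,
        "quantize": True,
        "n_colors": 32,
    }
    unknown = set(overrides) - set(params)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    params.update(overrides)

    # The handler expects the image as base64 text, not raw bytes.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return json.dumps({"input": {"image": encoded, **params}})


# Placeholder bytes stand in for a real PNG or JPEG file's contents.
example_bytes = b"\x89PNG\r\n\x1a\n"
request_body = build_request(example_bytes, n_colors=16)
```

For a real image, read the bytes with `open("your_image.jpg", "rb").read()` and pass them to `build_request`.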

The response returns a base64-encoded PNG:

{
  "image": "<base64 png>",
  "rembg_ok": true
}
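
On the client side, the response can be unpacked as sketched below. `decode_response` is a hypothetical helper, and reading `rembg_ok` as "background removal succeeded" is an assumption based on the fallback behavior described under Tips.

```python
import base64
import json
from typing import Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_response(body: str) -> Tuple[bytes, bool]:
    """Return (png_bytes, rembg_ok) from a JSON response body."""
    data = json.loads(body)
    # validate=True rejects characters outside the base64 alphabet.
    png_bytes = base64.b64decode(data["image"], validate=True)
    if not png_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("response image is not a PNG")
    return png_bytes, bool(data.get("rembg_ok", False))


# Fake response built locally for illustration; a real one comes from the endpoint.
fake_body = json.dumps({
    "image": base64.b64encode(PNG_SIGNATURE + b"payload").decode("ascii"),
    "rembg_ok": True,
})
png_bytes, rembg_ok = decode_response(fake_body)
```

The returned bytes can be written straight to a `.png` file opened in binary mode.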

Pipeline breakdown

| Step | Tool | Role |
| --- | --- | --- |
| Background removal | rembg | Isolates the subject on white background |
| Edge detection | OpenCV Canny (low=80, high=180) | Extracts silhouette for ControlNet |
| Shape preservation | diffusers/controlnet-canny-sdxl-1.0 | Locks the original shape during generation |
| Style transfer | This LoRA | Applies pixel art style |

Note: rembg and ControlNet are not included in the LoRA weights file. They are separate open-source tools loaded at inference time.
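
To illustrate what the two Canny thresholds control, here is a pure-Python sketch of the final hysteresis stage only. It is not OpenCV's implementation: the real `cv2.Canny` also computes gradients and applies non-maximum suppression first. The function name `hysteresis` and the toy gradient grid are made up for the example.

```python
from collections import deque
from typing import List


def hysteresis(grad: List[List[float]], low: float = 80, high: float = 180) -> List[List[bool]]:
    """Keep pixels above `high`, plus pixels above `low` connected to them."""
    if not grad:
        return []
    rows, cols = len(grad), len(grad[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()

    # Pixels above the high threshold are accepted as edges immediately.
    for r in range(rows):
        for c in range(cols):
            if grad[r][c] > high:
                edges[r][c] = True
                queue.append((r, c))

    # Grow edges into 8-connected neighbors that exceed the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and not edges[nr][nc] and grad[nr][nc] > low):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


# Toy gradient magnitudes: one strong pixel, one attached weak pixel,
# and one isolated weak pixel that should be discarded.
grad = [
    [200, 120, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 120],
]
edge_map = hysteresis(grad)
```

Raising `low` drops faint detail such as fur texture, while lowering `high` lets more outlines seed an edge, which is the trade-off behind the 80/180 setting.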


Tips

  • Works best on stuffed animals and character-shaped objects with clear silhouettes
  • If background removal fails, the pipeline automatically falls back to the original image
  • For more pixel-art-like results, apply color quantization after generation:
    result.quantize(colors=32, method=Image.Quantize.MEDIANCUT, dither=Image.Dither.NONE).convert("RGB")
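
The fallback in the second tip can be pictured as a small wrapper around the background-removal call. This is a sketch of the idea, not the code in pipeline.py; `remove_background_or_fallback` and `failing_remover` are hypothetical names, and the boolean flag is assumed to correspond to the `rembg_ok` field in the handler response.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def remove_background_or_fallback(image: T, remover: Callable[[T], T]) -> Tuple[T, bool]:
    """Return (processed_image, True), or (original_image, False) on failure."""
    try:
        return remover(image), True
    except Exception:
        # Any failure in background removal keeps the pipeline running
        # on the unmodified input.
        return image, False


def failing_remover(image: str) -> str:
    """Stand-in remover that always fails, to exercise the fallback path."""
    raise RuntimeError("background removal failed")


# Strings stand in for images so the example runs without extra libraries.
ok_image, ok_flag = remove_background_or_fallback("photo", lambda img: img + "+cutout")
fb_image, fb_flag = remove_background_or_fallback("photo", failing_remover)
```

In practice the `remover` argument would be something like `rembg.remove`, with a PIL image in place of the string.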
    