fancyfeast/llama-joycaption-beta-one-hf-llava · Dataset release + contribution offer

Hi, I would like to request an update on this issue: https://github.com/fpgaminer/joycaption/issues/13 "Training data and inclusion".

This model compares favorably to Qwen2.5-VL 32B on my object detection tasks, but it performs worse on some specific targets (clothing type). It would help me a lot to see the training dataset so I can:

Check the diversity of the set (there may be under-represented areas)
Assess the quality of the labels and experiment with adding more detail
Contribute my own high-quality (image, label) pairs

Contrary to what the README suggests, my object detection task worked well with JoyCaption even with a custom system prompt and a custom instruction different from those listed.

I see a dataset here, but I do not know where to find the images that correspond to the hashes: https://huggingface.co/datasets/fancyfeast/joy-captioning-20250328b (only 178 rows have a URL). Given the size of the data, a torrent would make sharing the files easier.
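
In case it is useful, here is a minimal sketch of how the URL coverage of such a dataset could be checked. The field names `hash` and `url` and the sample rows are assumptions for illustration only; I have not confirmed the actual column names of the dataset.

```python
from typing import Iterable, Mapping, Optional


def count_rows_with_url(
    rows: Iterable[Mapping[str, Optional[str]]],
    url_field: str = "url",  # hypothetical column name
) -> int:
    """Count rows whose URL field is a non-empty string."""
    count = 0
    for row in rows:
        value = row.get(url_field)
        if isinstance(value, str) and value.strip():
            count += 1
    return count


# Hypothetical rows: only the first one carries a usable URL.
sample = [
    {"hash": "aaa", "url": "https://example.com/1.jpg"},
    {"hash": "bbb", "url": None},
    {"hash": "ccc", "url": ""},
]

print(count_rows_with_url(sample))
```

Rows loaded with the Hugging Face `datasets` library iterate as dicts, so the same function should apply to them once the real column name is known.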

Thank you