
vikhyatk/moondream2 • Image-Text-to-Text • 2B • Updated • 156k • 1.26k
https://huggingface.co/papers/2501.03006
Detect and estimate human poses in images and videos
Generate text based on your input