---
library_name: transformers
license: apache-2.0
tags:
- vision
- image-captioning
- blip
- multimodal
- fashion
datasets:
- Marqo/fashion200k
base_model:
- Salesforce/blip-image-captioning-large
---

# Fine-Tuned BLIP Model for Fashion Image Captioning

This is a fine-tuned BLIP (Bootstrapping Language-Image Pre-training) model designed specifically for **fashion image captioning**. It was fine-tuned on the **Marqo Fashion200k** dataset to generate descriptive, contextually relevant captions for fashion-related images.

## Model Details

- **Model Type:** BLIP (vision-language pre-training)
- **Architecture:** BLIP uses a multimodal transformer architecture to jointly model visual and textual information.
- **Base Model:** [Salesforce/blip-image-captioning-large](https://huggingface.co/Salesforce/blip-image-captioning-large)
- **Fine-Tuning Dataset:** [Marqo/fashion200k](https://huggingface.co/datasets/Marqo/fashion200k), a dataset of fashion images with corresponding text descriptions
- **Task:** Fashion Image Captioning
- **License:** Apache 2.0

## Usage

You can use this model with the Hugging Face `transformers` library for fashion image captioning tasks.

### Installation

First, install the required libraries:

```bash
pip install transformers torch
```
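
### Generating a Caption

Below is a minimal inference sketch using the standard `transformers` BLIP classes. The repository ID and image URL are placeholders: substitute this model's actual Hub ID and your own fashion image (Pillow and `requests` are assumed to be available).

```python
import requests
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration

# Placeholder repo ID -- replace with this model's actual Hub ID.
model_id = "your-username/blip-fashion-captioning"

processor = BlipProcessor.from_pretrained(model_id)
model = BlipForConditionalGeneration.from_pretrained(model_id)
model.eval()

# Placeholder URL -- any RGB fashion image works here.
url = "https://example.com/fashion-image.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Preprocess the image and generate a caption.
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=50)

caption = processor.decode(output_ids[0], skip_special_tokens=True)
print(caption)
```

BLIP also supports conditional captioning: pass a text prompt alongside the image, e.g. `processor(images=image, text="a photo of", return_tensors="pt")`, and the model will complete the caption from that prefix.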