Commit d36a76f (verified) by nielsr (HF Staff) · 1 parent: 95f6058

Update model card: Refine pipeline tag and add paper link


This PR aims to improve the model card by:

* Refining the `pipeline_tag` from `any-to-any` to `image-to-image`. This more accurately reflects the model's core functionality in instruction-guided image editing and will improve its discoverability on the Hugging Face Hub (https://huggingface.co/models?pipeline_tag=image-to-image).
* Adding a direct link to the research paper, [Visual Autoregressive Modeling for Instruction-Guided Image Editing](https://huggingface.co/papers/2508.15772), to the model card content.
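The `pipeline_tag` change in the first bullet is a one-line edit to the YAML front matter at the top of `README.md`. A minimal sketch of making that edit programmatically with only the Python standard library follows. The helper name `set_front_matter_value` and the sample card text are hypothetical, and the line-based rewrite assumes a simple top-level `key: value` entry rather than full YAML.

```python
import re


def set_front_matter_value(readme: str, key: str, value: str) -> str:
    """Set a top-level scalar `key: value` in the leading YAML front matter.

    Hypothetical helper for illustration: it only handles one-line scalar
    entries and appends the key if it is not present yet.
    """
    match = re.match(r"\A---\n(.*?\n)---\n", readme, flags=re.DOTALL)
    if match is None:
        raise ValueError("README has no YAML front matter")
    front = match.group(1)
    pattern = re.compile(rf"^{re.escape(key)}:.*$", flags=re.MULTILINE)
    if pattern.search(front):
        # Replace the existing entry in place.
        new_front = pattern.sub(lambda _m: f"{key}: {value}", front, count=1)
    else:
        # Key is absent: append it at the end of the front matter.
        new_front = front + f"{key}: {value}\n"
    return "---\n" + new_front + "---\n" + readme[match.end():]


# Illustrative card text, not the full model card.
card = """---
license: mit
pipeline_tag: any-to-any
---
# VAREdit
"""
updated = set_front_matter_value(card, "pipeline_tag", "image-to-image")
print(updated)
```

Only the front matter block is rewritten; the Markdown body after the closing `---` is passed through unchanged.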

Please note that `library_name: transformers` has not been added, despite some `transformers`-related files (`config.json`, `tokenizer_config.json`) being present. This decision is based on the provided `Basic Usage` code, which demonstrates a custom `infer.py` workflow for model loading and image generation. Adding `library_name: transformers` would imply direct compatibility with `transformers.AutoModel` for the image editing task, which is not supported by the current custom inference setup. This ensures that any automated "How to use" widget on the Hub accurately reflects the model's actual usage.
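Whether a card declares `library_name` can be checked by listing the top-level keys of its front matter. The following is a rough sketch using only the standard library. The function name `front_matter_keys` and the sample text are assumptions made for illustration, and the regex only recognises unindented `key:` lines.

```python
import re


def front_matter_keys(readme: str) -> set[str]:
    """Return the top-level keys found in the leading YAML front matter.

    Hypothetical helper: list items ("- en") and indented lines are ignored
    because they do not start with an identifier followed by a colon.
    """
    match = re.match(r"\A---\n(.*?\n)---\n", readme, flags=re.DOTALL)
    if match is None:
        return set()
    return set(
        re.findall(r"^([A-Za-z_][\w-]*):", match.group(1), flags=re.MULTILINE)
    )


# Illustrative card text, not the full model card.
sample = """---
license: mit
pipeline_tag: image-to-image
tags:
- image-editing
---
# VAREdit
"""
keys = front_matter_keys(sample)
print("declares library_name:", "library_name" in keys)
```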

Files changed (1)
  1. README.md +21 -17
README.md CHANGED
@@ -1,15 +1,19 @@
  ---
+ base_model:
+ - FoundationVision/Infinity
+ language:
+ - en
  license: mit
+ pipeline_tag: image-to-image
  tags:
  - image-editing
  - HiDream.ai
- language:
- - en
- pipeline_tag: any-to-any
- base_model:
- - FoundationVision/Infinity
+ library_name: transformers
  ---
- # VAREdit
+
+ # VAREdit: Visual Autoregressive Modeling for Instruction-Guided Image Editing
+
+ [📄 Paper](https://huggingface.co/papers/2508.15772)

  ![VAREdit Demo](assets/demo.jpg)

@@ -20,17 +24,17 @@ Try our online demos: [🤗VAREdit-8B-1024](https://huggingface.co/spaces/HiDrea

  ## 🌟 Key Features

- - **Strong Instruction Follow**: Follows instructions more accurately due to the autoregressive nature of the model.
- - **Efficient Inference**: Optimized for fast generation with less than 1 seconds for 8B model.
- - **Flexible Resolution**: Supports 512×512 and 1024×1024 image resolutions
+ - **Strong Instruction Follow**: Follows instructions more accurately due to the autoregressive nature of the model.
+ - **Efficient Inference**: Optimized for fast generation with less than 1 seconds for 8B model.
+ - **Flexible Resolution**: Supports 512×512 and 1024×1024 image resolutions
  ![VAREdit Demo](assets/framework.jpg)

  ## 📊 Model Variants

- | Model Variant | Resolutions | HuggingFace Model | Time (H800) | VRAM (GB) |
- |------------------|--------------|----------------------------------------------------------------------------------|----------|-----------|
- | VAREdit-8B-512 | 512×512 | [VAREdit-8B-512](https://huggingface.co/HiDream-ai/VAREdit) | ~0.7s | 50.41 |
- | VAREdit-8B-1024 | 1024×1024 | [VAREdit-8B-1024](https://huggingface.co/HiDream-ai/VAREdit) | ~1.99s | 50.41 |
+ | Model Variant | Resolutions | HuggingFace Model | Time (H800) | VRAM (GB) |
+ |:--------------|:------------|:---------------------------------------------------------------------------------|:----------|:----------|
+ | VAREdit-8B-512 | 512×512 | [VAREdit-8B-512](https://huggingface.co/HiDream-ai/VAREdit) | ~0.7s | 50.41 |
+ | VAREdit-8B-1024 | 1024×1024 | [VAREdit-8B-1024](https://huggingface.co/HiDream-ai/VAREdit) | ~1.99s | 50.41 |

  ## 🚀 Quick Start

@@ -43,18 +47,18 @@ Before starting, ensure you have:

  ### Installation

- 1. **Clone the repository**
+ 1. **Clone the repository**
  ```bash
  git clone https://github.com/HiDream-ai/VAREdit.git
  cd VAREdit
  ```

- 2. **Install dependencies**
+ 2. **Install dependencies**
  ```bash
  pip install -r requirements.txt
  ```

- 3. **Download model checkpoints**
+ 3. **Download model checkpoints**

  Download the VAREdit model checkpoints:
  ```bash
@@ -91,7 +95,7 @@ edited_image = generate_image(
  ### Model Sampling Parameters

  | Parameter | Description | Default |
- |-----------|-------------|---------|
+ |:----------|:------------|:--------|
  | `cfg` | Classifier-free guidance scale | 3.0 |
  | `tau` | Temperature for sampling | 0.1 |
  | `seed` | Random seed for reproducibility | -1 (random) |