nielsr (HF Staff) committed
Commit dbee8f4 · verified · 1 Parent(s): 7c1644c

Improve model card: Add pipeline tag, library name, update paper link, and add project page


This PR enhances the model card for GUI-Owl-7B by:

* Adding the `pipeline_tag: image-text-to-text` to improve discoverability on the Hugging Face Hub, as the model is a vision-language model capable of multimodal GUI automation.
* Specifying `library_name: transformers` to enable the automated "how to use" code snippet, as the model's `config.json` indicates compatibility with the `transformers` library (e.g., `Qwen2_5_VLForConditionalGeneration`, `Qwen2Tokenizer`, `Qwen2_5_VLProcessor`); an illustrative loading sketch is included below.
* Updating the "Paper" link to the official Hugging Face Papers page: [Mobile-Agent-v3: Foundamental Agents for GUI Automation](https://huggingface.co/papers/2508.15144).
* Adding a "Project Page" link: [https://osatlas.github.io/](https://osatlas.github.io/), which was identified in the associated GitHub repository's README.

These updates will make the model more accessible, discoverable, and user-friendly for the community.
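For reference, the snippet below is a minimal, untested sketch of the kind of "how to use" code that `library_name: transformers` unlocks, based on the Qwen2.5-VL classes named in the model's `config.json`. The repository id `mPLUG/GUI-Owl-7B`, the screenshot path, and the instruction text are placeholders, not values taken from this model card; the official usage instructions in the repository take precedence.

```python
# Hypothetical loading sketch; repo id, image path, and prompt are placeholders.
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model_id = "mPLUG/GUI-Owl-7B"  # placeholder: replace with the actual Hub repo id
model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)
processor = AutoProcessor.from_pretrained(model_id)

# A single screenshot plus a grounding-style instruction, rendered with the chat template.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Locate the Settings icon and output the click coordinates."},
        ],
    }
]
prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(
    text=[prompt], images=[Image.open("screenshot.png")], return_tensors="pt"
).to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens before decoding so only the model's answer is printed.
print(processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0])
```

If the checkpoint expects the `qwen_vl_utils` preprocessing helpers or a specific action-space prompt format, the sketch would need to be adapted accordingly.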

Files changed (1)
README.md +10 -7
README.md CHANGED
@@ -1,11 +1,13 @@
 ---
-license: mit
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-VL-7B-Instruct
+language:
+- en
+license: mit
 tags:
 - arxiv:2508.15144
+pipeline_tag: image-text-to-text
+library_name: transformers
 ---
 
 # GUI-Owl
@@ -16,9 +18,10 @@ tags:
 
 GUI-Owl is a model series developed as part of the Mobile-Agent-V3 project. It achieves state-of-the-art performance across a range of GUI automation benchmarks, including ScreenSpot-V2, ScreenSpot-Pro, OSWorld-G, MMBench-GUI, Android Control, Android World, and OSWorld. Furthermore, it can be instantiated as various specialized agents within the Mobile-Agent-V3 multi-agent framework to accomplish more complex tasks.
 
-* **Paper**: [Paper Link](https://github.com/X-PLUG/MobileAgent/blob/main/Mobile-Agent-v3/assets/MobileAgentV3_Tech.pdf)
-* **GitHub Repository**: https://github.com/X-PLUG/MobileAgent
-* **Online Demo**: Comming soon
+* **Paper**: [Mobile-Agent-v3: Foundamental Agents for GUI Automation](https://huggingface.co/papers/2508.15144)
+* **Project Page**: [https://osatlas.github.io/](https://osatlas.github.io/)
+* **GitHub Repository**: https://github.com/X-PLUG/MobileAgent
+* **Online Demo**: Comming soon
 
 ## Performance
 
@@ -91,4 +94,4 @@ If you find our paper and model useful in your research, feel free to give us a
   primaryClass={cs.AI},
   url={https://arxiv.org/abs/2508.15144},
 }
-```
+```