# UI-TARS-desktop **Repository Path**: 461827813/UI-TARS-desktop ## Basic Information - **Project Name**: UI-TARS-desktop - **Description**: No description available - **Primary Language**: Unknown - **License**: Apache-2.0 - **Default Branch**: 11 - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2025-04-22 - **Last Updated**: 2025-04-22 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README
   π Paper   
| π€ Hugging Face Models  
|   π€ ModelScope  
π₯οΈ Desktop Application   
|    π Midscene (use in browser)
2. Enable the permission of **UI TARS** in MacOS:
- System Settings -> Privacy & Security -> **Accessibility**
- System Settings -> Privacy & Security -> **Screen Recording**
3. Then open **UI TARS** application, you can see the following interface:
#### Windows
**Still to run** the application, you can see the following interface:
### Deployment
#### Cloud Deployment
We recommend using HuggingFace Inference Endpoints for fast deployment.
We provide two docs for users to refer:
English version: [GUI Model Deployment Guide](https://juniper-switch-f10.notion.site/GUI-Model-Deployment-Guide-17b5350241e280058e98cea60317de71)
δΈζη: [GUI樑ει¨η½²ζη¨](https://bytedance.sg.larkoffice.com/docx/TCcudYwyIox5vyxiSDLlgIsTgWf#U94rdCxzBoJMLex38NPlHL21gNb)
#### Local Deployment [vLLM]
We recommend using vLLM for fast deployment and inference. You need to use `vllm>=0.6.1`.
```bash
pip install -U transformers
VLLM_VERSION=0.6.6
CUDA_VERSION=cu124
pip install vllm==${VLLM_VERSION} --extra-index-url https://download.pytorch.org/whl/${CUDA_VERSION}
```
##### Download the Model
We provide three model sizes on Hugging Face: **2B**, **7B**, and **72B**. To achieve the best performance, we recommend using the **7B-DPO** or **72B-DPO** model (based on your hardware configuration):
- [2B-SFT](https://huggingface.co/bytedance-research/UI-TARS-2B-SFT)
- [7B-SFT](https://huggingface.co/bytedance-research/UI-TARS-7B-SFT)
- [7B-DPO](https://huggingface.co/bytedance-research/UI-TARS-7B-DPO)
- [72B-SFT](https://huggingface.co/bytedance-research/UI-TARS-72B-SFT)
- [72B-DPO](https://huggingface.co/bytedance-research/UI-TARS-72B-DPO)
##### Start an OpenAI API Service
Run the command below to start an OpenAI-compatible API service:
```bash
python -m vllm.entrypoints.openai.api_server --served-model-name ui-tars --model
> **Note**: VLM Base Url is OpenAI compatible API endpoints (see [OpenAI API protocol document](https://platform.openai.com/docs/guides/vision/uploading-base-64-encoded-images) for more details).
## Development
Just simple two steps to run the application:
```bash
pnpm install
pnpm run dev
```
> **Note**: On MacOS, you need to grant permissions to the app (e.g., iTerm2, Terminal) you are using to run commands.
### Testing
```bash
# Unit test
pnpm run test
# E2E test
pnpm run test:e2e
```
## System Requirements
- Node.js >= 20
- Supported Operating Systems:
- Windows 10/11
- macOS 10.15+
## License
UI-TARS Desktop is licensed under the Apache License 2.0.
## Citation
If you find our paper and code useful in your research, please consider giving a star :star: and citation :pencil:
```BibTeX
@article{qin2025ui,
title={UI-TARS: Pioneering Automated GUI Interaction with Native Agents},
author={Qin, Yujia and Ye, Yining and Fang, Junjie and Wang, Haoming and Liang, Shihao and Tian, Shizuo and Zhang, Junda and Li, Jiahao and Li, Yunxin and Huang, Shijue and others},
journal={arXiv preprint arXiv:2501.12326},
year={2025}
}
```