# MobileAgent

**Repository Path**: haitaob/mobile-agent

## Basic Information

- **Project Name**: MobileAgent
- **Description**: Found this open-source AI tool, which was not available here, so I mirrored it from GitHub.
- **Primary Language**: Python
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 0
- **Created**: 2024-09-20
- **Last Updated**: 2024-09-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README
## 📺Demo

### Mobile-Agent-v3 (Note: the video is not sped up)

**YouTube**

[Watch on YouTube](https://www.youtube.com/watch?v=EMbIpzqJld0)

**Bilibili**

[Watch on Bilibili](https://www.bilibili.com/video/BV1pPvyekEsa/?share_source=copy_web&vd_source=47ffcd57083495a8965c8cdbe1a751ae)

### PC-Agent

**Chrome and DingTalk**

https://github.com/user-attachments/assets/b890a08f-8a2f-426d-9458-aa3699185030

**Word**

https://github.com/user-attachments/assets/37f0a0a5-3d21-4232-9d1d-0fe845d0f77d

### Mobile-Agent-v2

https://github.com/X-PLUG/MobileAgent/assets/127390760/d907795d-b5b9-48bf-b1db-70cf3f45d155

### Mobile-Agent

https://github.com/X-PLUG/MobileAgent/assets/127390760/26c48fb0-67ed-4df6-97b2-aa0c18386d31

## 📢News

* 🔥🔥[8.23] We released PC-Agent, a **PC** operation assistant for Mac and Windows, built on the Mobile-Agent-v2 framework.
* 🔥🔥[7.29] Mobile-Agent won the **Best Demo Award** at the ***23rd China National Conference on Computational Linguistics*** (CCL 2024). At CCL 2024 we demonstrated Mobile-Agent-v3, which will be open-sourced soon. It has a smaller memory footprint (8 GB), faster inference (10-15 seconds per operation), and uses open-source models. See the video demo in the 📺Demo section above.
* 🔥[6.27] We released a demo on [Hugging Face](https://huggingface.co/spaces/junyangwang0410/Mobile-Agent) and [ModelScope](https://modelscope.cn/studios/wangjunyang/Mobile-Agent-v2) where you can upload phone screenshots to try Mobile-Agent-v2 right away, with no model or device configuration required.
* [6.4] Modelscope-Agent now supports Mobile-Agent-V2, based on Android Adb Env. See the [application](https://github.com/modelscope/modelscope-agent/tree/master/apps/mobile_agent).
* [6.4] We released Mobile-Agent-v2, a new-generation mobile device operation assistant that achieves effective navigation through multi-agent collaboration.
* [3.10] Mobile-Agent was accepted by the **ICLR 2024 Workshop on Large Language Model (LLM) Agents**.

## 📱Versions

* [Mobile-Agent-v3](Mobile-Agent-v3/README_zh.md)
* [Mobile-Agent-v2](Mobile-Agent-v2/README_zh.md) - Mobile device operation assistant with effective navigation via multi-agent collaboration
* [Mobile-Agent](Mobile-Agent/README_zh.md) - Autonomous multi-modal mobile device agent with visual perception

## ⭐Star History

[Star history chart](https://star-history.com/#X-PLUG/MobileAgent&Date)

## Citation

If you find Mobile-Agent useful for your research and applications, please cite using this BibTeX:

```
@article{wang2024mobile2,
  title={Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration},
  author={Wang, Junyang and Xu, Haiyang and Jia, Haitao and Zhang, Xi and Yan, Ming and Shen, Weizhou and Zhang, Ji and Huang, Fei and Sang, Jitao},
  journal={arXiv preprint arXiv:2406.01014},
  year={2024}
}

@article{wang2024mobile,
  title={Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception},
  author={Wang, Junyang and Xu, Haiyang and Ye, Jiabo and Yan, Ming and Shen, Weizhou and Zhang, Ji and Huang, Fei and Sang, Jitao},
  journal={arXiv preprint arXiv:2401.16158},
  year={2024}
}
```

## 📦Related Projects

* [AppAgent: Multimodal Agents as Smartphone Users](https://github.com/mnotgod96/AppAgent)
* [mPLUG-Owl & mPLUG-Owl2: Modularized Multimodal Large Language Model](https://github.com/X-PLUG/mPLUG-Owl)
* [Qwen-VL: A Versatile Vision-Language Model for Understanding, Localization, Text Reading, and Beyond](https://github.com/QwenLM/Qwen-VL)
* [GroundingDINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection](https://github.com/IDEA-Research/GroundingDINO)
* [CLIP: Contrastive Language-Image Pretraining](https://github.com/openai/CLIP)