# MAexpv8
**Repository Path**: xthaf/maexpv8
## Basic Information
- **Project Name**: MAexpv8
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: v1
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-05-16
- **Last Updated**: 2025-05-16
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
MAexp
A Generic Platform for RL-based Multi-Agent Exploration
MAexp is a generic, high-efficiency platform designed for multi-agent exploration, encompassing a diverse range of **scenarios** and **MARL algorithms**. The platform is developed in Python to integrate smoothly with existing reinforcement learning algorithms, and it is equally applicable to traditional exploration methods. To bridge the sim-to-real gap, all maps and agent properties within MAexp are modelled **continuously**, incorporating realistic physics to closely mirror real-world exploration.
There are four kinds of scenarios in MAexp: Random Obstacle, Maze, Indoor, and Outdoor.
An introduction video can be seen on [Bilibili](https://www.bilibili.com/video/BV12w4m1y7Qm/?vd_source=4a84fda80f7a87775762c3f89840bbfd) or [YouTube](https://youtu.be/wOxqxUpdxlM).
If you find this project useful, please consider giving it a star on GitHub! It helps the project gain visibility and supports its continued development. Thank you!
## Quick Start
### Installation
```bash
$ conda create -n maexp python=3.8 # or 3.9
$ conda activate maexp
$ git clone https://github.com/Replicable-MARL/MARLlib.git && cd MARLlib
$ pip install setuptools==65.5.0
$ pip install --user wheel==0.38.0
$ pip install -r requirements.txt
$ pip install protobuf==3.20.0
$ pip install scikit-fmm
$ cd /Path/To/MARLlib/marllib/patch
$ python add_patch.py -y
$ pip install tensorboard
$ pip install einops
$ pip install open3d
## If your torch build cannot use CUDA, try reinstalling a CUDA-specific build:
$ pip uninstall torch torchvision torchaudio
$ pip install torch==1.12.1+cu116 torchvision==0.13.1+cu116 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu116
```
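After installation, it is worth confirming that torch can actually see your GPU. A minimal check, run inside the activated `maexp` environment:
```python
# Minimal sanity check: does the installed torch build see the GPU?
import torch

print("torch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
```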
### Preparation
(1) **Modify the code** of MARLlib and Ray.
```bash
$ cd MAexp
$ python modify.py
```
(2) Change the parameters of **Ray** and the **MARL algorithms** as follows:
Ray: `/Path/To/envs/maexp/lib/python3.8/site-packages/marllib/marl/ray/ray.yaml`
```yaml
# ray.yaml
local_mode: False # True for debug mode only
share_policy: "group" # individual(separate) / group(division) / all(share)
evaluation_interval: 50 # evaluate model every 50 training iterations
framework: "torch" # only torch is supported
num_workers: 0 # number of rollout workers
num_gpus: 1 # GPUs to use
num_cpus_per_worker: 5 # CPUs allocated to each worker
num_gpus_per_worker: 0.25 # GPU fraction allocated to each worker
checkpoint_freq: 100 # save model every 100 training iterations
checkpoint_end: True # save model at the end of the experiment
restore_path: {"model_path": "", "params_path": ""} # model and params paths, used to 1. resume an experiment 2. render a policy
stop_iters: 9999999 # stop training at this iteration
stop_timesteps: 2000000 # stop training at this timestep count
stop_reward: 999999 # stop training at this reward
seed: 321 # ray seed
local_dir: "/Path/To/Your/Folder" # where all results are placed
```
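The exact `site-packages` path varies between machines. If you are unsure where the installed `ray.yaml` lives, a minimal sketch to locate and print it from the active environment (assuming `marllib` and PyYAML are importable, as they are after the installation above):
```python
# Locate the installed ray.yaml inside the marllib package and print its contents.
import os
import yaml
import marllib

ray_yaml = os.path.join(os.path.dirname(marllib.__file__), "marl", "ray", "ray.yaml")
with open(ray_yaml) as f:
    print(ray_yaml)
    print(yaml.safe_load(f))
```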
MARL algorithms: `/Path/To/envs/maexp/lib/python3.8/site-packages/marllib/marl/algos/hyperparams/common/`
```yaml
# ippo.yaml
algo_args:
  use_gae: True
  lambda: 1.0
  kl_coeff: 0.2
  batch_episode: 2
  num_sgd_iter: 2
  vf_loss_coeff: 1.0
  lr: 0.0001
  entropy_coeff: 0.001
  clip_param: 0.2
  vf_clip_param: 10.0
  batch_mode: "truncate_episodes"

# itrpo.yaml
algo_args:
  use_gae: True
  lambda: 1.0
  gamma: 0.99
  batch_episode: 2
  kl_coeff: 0.2
  num_sgd_iter: 2
  grad_clip: 10
  clip_param: 0.2
  vf_loss_coeff: 1.0
  entropy_coeff: 0.001
  vf_clip_param: 10.0
  batch_mode: "truncate_episodes"
  kl_threshold: 0.06
  accept_ratio: 0.5
  critic_lr: 0.001

# mappo.yaml
algo_args:
  use_gae: True
  lambda: 1.0
  kl_coeff: 0.2
  batch_episode: 2
  num_sgd_iter: 2
  vf_loss_coeff: 1.0
  lr: 0.0001
  entropy_coeff: 0.001
  clip_param: 0.2
  vf_clip_param: 10.0
  batch_mode: "truncate_episodes"

# matrpo.yaml
algo_args:
  use_gae: True
  lambda: 1.0
  gamma: 0.99
  batch_episode: 2
  kl_coeff: 0.2
  num_sgd_iter: 2
  grad_clip: 10
  clip_param: 0.2
  vf_loss_coeff: 1.0
  entropy_coeff: 0.001
  vf_clip_param: 10.0
  batch_mode: "truncate_episodes"
  kl_threshold: 0.06
  accept_ratio: 0.5
  critic_lr: 0.001

# vdppo.yaml
algo_args:
  use_gae: True
  lambda: 1.0
  kl_coeff: 0.2
  batch_episode: 2
  num_sgd_iter: 2
  vf_loss_coeff: 1.0
  lr: 0.0001
  entropy_coeff: 0.001
  clip_param: 0.2
  vf_clip_param: 10.0
  batch_mode: "truncate_episodes"
  mixer: "qmix" # qmix or vdn

# vda2c.yaml
algo_args:
  use_gae: True
  lambda: 1.0
  vf_loss_coeff: 1.0
  batch_episode: 2
  batch_mode: "truncate_episodes"
  lr: 0.0001
  entropy_coeff: 0.001
  mixer: "qmix" # qmix or vdn
```
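To double-check your edits across all six algorithm files at once, a small sketch that loads each file from the `hyperparams/common/` directory given above and prints its `algo_args` (again assuming `marllib` and PyYAML are importable):
```python
# Print the algo_args of every algorithm config under hyperparams/common/.
import glob
import os
import yaml
import marllib

common = os.path.join(os.path.dirname(marllib.__file__),
                      "marl", "algos", "hyperparams", "common")
for path in sorted(glob.glob(os.path.join(common, "*.yaml"))):
    with open(path) as f:
        cfg = yaml.safe_load(f)
    print(os.path.basename(path), "->", cfg.get("algo_args"))
```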
(3) Data Preparation
Before you begin, you need to download the dataset and place it in the root directory of the project. Follow these steps:
1. Download the dataset from the provided link: [Google Drive](https://drive.google.com/file/d/1zbehe442sUbfFGGhgNmtvY89QiOmbiPW/view?usp=sharing) or [Baidu Netdisk](https://pan.baidu.com/share/init?surl=n-PYbfAluBkw7HPPGXJfxg&pwd=1234)
2. Extract the downloaded file.
3. Copy or move the dataset to the root directory of this project.
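An optional sanity check that the extracted data ended up where you expect. The `maps` folder name below is a placeholder, not the archive's actual layout, so substitute whatever top-level folder the download contains:
```python
# Optional: confirm the dataset was extracted into the project root.
# NOTE: "maps" is a placeholder folder name; use the folder the archive actually contains.
from pathlib import Path

expected = Path(".") / "maps"  # placeholder; adjust to the extracted folder name
if expected.is_dir():
    print("found:", sorted(p.name for p in expected.iterdir())[:10])
else:
    print("dataset folder not found at", expected.resolve())
```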
### Training
(1) Change the parameters in the yaml file of the specific scenario (e.g., to explore a maze, edit `./yaml/maze.yaml`). **`is_train`** must be `True`. If you are using GPUs for training, it is recommended to use at least **two**: one for sampling and another for the model itself.
```yaml
device: 'cuda' # 'cuda' or 'cpu'
num_agent: 3 # number of agents in the swarm
is_train: True # True for training
algo: 'vda2c' # MARL algorithm, one of [ippo, itrpo, mappo, matrpo, vdppo, vda2c]
Map:
  training_map_num: 1 # how many maps from map_list below are used
  map_resolution: 1.0
  region: 8
  max_global_step: 31
  map_list: ['map1','map19'] # maps used in training
  scene: 'maze' # scenario name; do not change
```
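Before launching a run, a quick validation of the scenario file can catch the common mistakes above (wrong `is_train`, an unknown `algo`, or a `training_map_num` larger than `map_list`). A minimal sketch, assuming the field layout matches the example just shown:
```python
# Load a scenario yaml and sanity-check the training-related fields.
import yaml

with open("./yaml/maze.yaml") as f:
    cfg = yaml.safe_load(f)

assert cfg["is_train"] is True, "set is_train to True before training"
assert cfg["algo"] in {"ippo", "itrpo", "mappo", "matrpo", "vdppo", "vda2c"}
assert cfg["Map"]["training_map_num"] <= len(cfg["Map"]["map_list"])
print("training", cfg["num_agent"], "agents with", cfg["algo"],
      "on scene", cfg["Map"]["scene"])
```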
(2) Check the parameters in `env_v7.py`. If you need to debug, set `local_mode` to `True` and `num_workers` to `1`.
```python
# num_workers is the number of parallel environments used for sampling; local_mode should be False for training and True for debugging.
method.fit(env, model, stop={'episode_reward_mean': 200000, 'timesteps_total': 10000000}, local_mode=False, num_workers=4, share_policy='all', checkpoint_freq=300)
```
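Since the same `method.fit` call is edited back and forth between debugging and training, one way to keep both settings side by side is to switch the keyword arguments on a single flag. A sketch around the call shown above (`method`, `env`, and `model` are defined in `env_v7.py`):
```python
# Toggle between debug and training configurations for method.fit.
DEBUG = False  # set True when debugging

if DEBUG:
    fit_kwargs = dict(local_mode=True, num_workers=1)
else:
    fit_kwargs = dict(local_mode=False, num_workers=4)

method.fit(env, model,
           stop={'episode_reward_mean': 200000, 'timesteps_total': 10000000},
           share_policy='all', checkpoint_freq=300, **fit_kwargs)
```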
(3) Then run this in the terminal:
```bash
python env_v7.py --yaml_file ./yaml/maze.yaml
```
### Testing
(1) Change the parameters: **`is_train`** must be `False`.
(2) Change the parameters in `env_v7.py`: add the `params_path` and `model_path`.
```python
# params_path is the training parameters path; model_path is the checkpoint path; local_mode must be True.
method.fit(env, model, stop={'episode_reward_mean': 200000, 'timesteps_total': 10000000}, restore_path={'params_path': "/remote-home/ums_zhushaohao/2023/Multi-agent-Exploration/exp_results/vda2c_vit_crossatt_MAexp/VDA2CTrainer_maexp_MAexp_b92fd_00000_0_2024-03-14_09-17-48/params.json", # experiment configuration
'model_path': "/remote-home/ums_zhushaohao/2023/Multi-agent-Exploration/exp_results/vda2c_vit_crossatt_MAexp/VDA2CTrainer_maexp_MAexp_ef8e1_00000_0_2024-03-16_10-50-00/checkpoint_011400/checkpoint-11400"},
local_mode=True, num_workers = 0, share_policy='all')
```
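The two paths follow Ray Tune's result layout: `params.json` sits in the trial directory, and checkpoints live in zero-padded `checkpoint_XXXXXX` subfolders. A small helper to build the `restore_path` dict from a trial directory, assuming that layout:
```python
# Build a restore_path dict from a Ray Tune trial directory.
import glob
import os

def latest_restore_path(trial_dir):
    """Return {'params_path', 'model_path'} using the newest checkpoint found."""
    params = os.path.join(trial_dir, "params.json")
    candidates = sorted(glob.glob(os.path.join(trial_dir, "checkpoint_*", "checkpoint-*")))
    # Drop Tune metadata files such as checkpoint-11400.tune_metadata.
    candidates = [c for c in candidates if not c.endswith(".tune_metadata")]
    return {"params_path": params, "model_path": candidates[-1]}
```
The zero-padded checkpoint directory names make the lexicographic sort pick the latest checkpoint.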
(3) Then run this in the terminal:
```bash
python env_v7.py --yaml_file ./yaml/maze.yaml
```
If you want to save the images, use this:
```bash
python env_v7.py --yaml_file ./yaml/maze.yaml --is_capture
```
### Larger swarm
MAexp can also accommodate a large number of robots, provided that `communication` and `action generation strategies` are properly adjusted to avoid `CUDA out-of-memory` errors while training the policy.
You can visualize the environment under a random-walk strategy with the following steps:
(1) Please comment out the section in `env_v7.py` where the MARL training is used, and enable the code at the bottom that employs the random walk strategy.
(2) Then run this in the terminal:
```bash
python env_v7.py --yaml_file ./yaml/outdoor_large_swarm.yaml
```
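The random-walk code at the bottom of `env_v7.py` is project-specific, but the idea is simply to feed every agent a random continuous action at each step. A hypothetical sketch of that loop (the `env.step` call is an assumed interface, not MAexp's actual API):
```python
# Hypothetical random-walk loop: each agent receives a random continuous action per step.
import numpy as np

num_agents = 20
rng = np.random.default_rng(0)

for step in range(100):
    # One bounded (dx, dy) action per agent, to keep motion realistic.
    actions = rng.uniform(-1.0, 1.0, size=(num_agents, 2))
    # obs, reward, done, info = env.step(actions)  # MAexp's actual step API may differ
```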
### Citation
If you use MAexp in your research, please cite the [MAexp paper](https://ieeexplore.ieee.org/document/10611573) (accepted at ICRA 2024; see you in Yokohama~).
```tex
@misc{zhu2024maexp,
  title={MAexp: A Generic Platform for RL-based Multi-Agent Exploration},
  author={Shaohao Zhu and Jiacheng Zhou and Anjun Chen and Mingming Bai and Jiming Chen and Jinming Xu},
  year={2024},
  eprint={2404.12824},
  archivePrefix={arXiv},
  primaryClass={cs.RO}
}
```
### Author
Shaohao Zhu ([zhushh9@zju.edu.cn](mailto:zhushh9@zju.edu.cn))