
--------------------------------------------------------------------------------

# SoundSpaces Challenge 2021

This repository contains starter code for the 2021 challenge, details of the task, and the training and evaluation setups. For an overview of the SoundSpaces Challenge, visit [soundspaces.org/challenge](https://soundspaces.org/challenge/).

This year, we are hosting a challenge on the audio-visual navigation task [1], in which an agent must find a sound-making object in unmapped 3D environments using visual and auditory perception.

## AudioNav Task

In AudioGoal navigation (AudioNav), an agent is spawned at a random starting position and orientation in an unseen environment, and a sound-emitting object is spawned at a random location in the same environment. At each time step, the agent receives a one-second audio input in the form of a waveform and must navigate to the target location. No ground-truth map is available: the agent must rely only on its sensory input (audio and RGB-D) to navigate.

### Dataset

The challenge is conducted on the SoundSpaces dataset, which is built on AI Habitat, Matterport3D, and Replica. For this challenge, we use the Matterport3D scenes due to their diversity and scale. The challenge focuses on evaluating agents' ability to generalize to unheard sounds and unseen environments. The training and validation splits are the same as those used in the unheard-sound experiments reported in the SoundSpaces paper; they can be downloaded from the SoundSpaces dataset page (including minival). For the challenge test split, we will use new sounds that are not currently publicly available on the website.

### Evaluation

After calling the STOP action, the agent is evaluated using the 'Success weighted by Path Length' (SPL) metric [2].
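As defined in [2], SPL weights the binary success of each episode by the ratio of the shortest-path length to the length of the path the agent actually took. A minimal sketch (the function and episode tuple layout are illustrative, not part of the starter code):

```python
def spl(episodes):
    """Success weighted by Path Length, averaged over episodes.

    Each episode is a tuple (success, shortest_path_length, agent_path_length).
    """
    total = 0.0
    for success, shortest, taken in episodes:
        # A failed episode contributes 0; a successful one contributes
        # shortest / max(taken, shortest), which is at most 1.
        total += float(success) * shortest / max(taken, shortest)
    return total / len(episodes)

# An agent that succeeds but walks twice the shortest path scores 0.5 on
# that episode; averaged with one failed episode, overall SPL is 0.25.
print(spl([(True, 5.0, 10.0), (False, 5.0, 5.0)]))  # 0.25
```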

An episode is deemed successful if, when the STOP action is called, the agent is within 0.36 m (2x the agent's radius) of the goal position.

## Participation Guidelines

Participate in the contest by registering on the [EvalAI challenge page](https://evalai.cloudcv.org/web/challenges/challenge-page/806/overview) and creating a team. Participants will upload Docker containers with their agents, which will be evaluated on an AWS GPU-enabled instance. Before pushing a submission for remote evaluation, participants should test the submission Docker image locally to make sure it works. Instructions for training, local evaluation, and online submission are provided below.

### Local Evaluation

1. Clone the challenge repository:

    ```bash
    git clone https://github.com/changanvr/soundspaces-challenge.git
    cd soundspaces-challenge
    ```

1. Implement your own agent or try one of ours. We provide an agent in `agent.py` that takes random actions:

    ```python
    import habitat
    import numpy
    import soundspaces

    class RandomAgent(habitat.Agent):
        def __init__(self, task_config):
            self._task_config = task_config

        def reset(self):
            pass

        def act(self, observations):
            return {"action": numpy.random.choice(self._task_config.TASK.POSSIBLE_ACTIONS)}

    def main():
        agent = RandomAgent(task_config=config)
        challenge = soundspaces.Challenge()
        challenge.submit(agent)
    ```

1. Install [nvidia-docker v2](https://github.com/NVIDIA/nvidia-docker) following the instructions here: [https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)](https://github.com/nvidia/nvidia-docker/wiki/Installation-(version-2.0)). Note: only Linux is supported; Windows and macOS are not.

1. Modify the provided Dockerfile if you need custom modifications. Say your code needs `pytorch`; such dependencies should be pip-installed inside the conda environment called `soundspaces` that ships with our `soundspaces/challenge` Docker image, as shown below:

    ```dockerfile
    FROM soundspaces/challenge:2021

    # install dependencies in the soundspaces conda environment
    RUN /bin/bash -c ". activate soundspaces; pip install torch"

    ADD agent.py /agent.py
    ```

    Build your Docker container with `docker build . --file audionav.dockerfile -t audionav_submission`. (Note: you may need `sudo` privileges to run this command.) In addition to the random agent, we provide the end-to-end RL agent as well as an example trained checkpoint file for participants to start with.

1. Follow the instructions for downloading the SoundSpaces [dataset](https://github.com/facebookresearch/sound-spaces/tree/master/soundspaces) and place all data under the `data/` folder.

    **Using symlinks:** If you used symlinks (i.e. `ln -s`) to link to already-downloaded data, there is an additional step. Make sure there is only one level of symlink (instead of a symlink to a symlink to a ... symlink):

    ```bash
    ln -f -s $(realpath data/scene_datasets/mp3d) \
        data/scene_datasets/mp3d
    ```

    In general, Docker does not work well with symlinks. If symlinks do not work for you, simply create a copy of the data folder under this directory.

1. Evaluate your Docker container locally:

    ```bash
    # Testing AudioNav
    ./test_locally_audionav_rgbd.sh --docker-name audionav_submission
    ```

    If the above command runs successfully, you will get an output similar to:

    ```
    2019-02-14 21:23:51,798 initializing sim Sim-v0
    2019-02-14 21:23:52,820 initializing task Nav-v0
    2020-02-14 21:23:56,339 distance_to_goal: 5.205519378185272
    2020-02-14 21:23:56,339 spl: 0.0
    ```

    Note: this same command will be run to evaluate your agent for the leaderboard. **Please submit your Docker image for remote evaluation (below) only if it runs successfully on your local setup.**

### Online submission

Follow the instructions in the `submit` tab of the EvalAI challenge page (coming soon) to submit your Docker image. Note that you will need a version of EvalAI `>= 1.3.5`.
Pasting those instructions here for convenience:

```bash
# Install the EvalAI command-line interface
pip install "evalai>=1.3.5"

# Set EvalAI account token
evalai set_token

# Push docker image to EvalAI docker registry
evalai push audionav_submission:latest --phase <phase-name>
```

Valid challenge phases are `soundspaces21-audionav-{minival, test-std, test-ch}`. The challenge consists of the following phases:

1. **Minival phase**: This split is the same as the one used in `./test_locally_audionav_rgbd.sh`. The purpose of this phase/split is sanity checking -- to confirm that our remote evaluation reports the same result as the one you see locally. Each team is allowed a maximum of 30 submissions per day for this phase, but please use them judiciously. We will block and disqualify teams that spam our servers.

2. **Test Standard phase**: The purpose of this phase/split is to serve as the public leaderboard establishing the state of the art; this is what should be used to report results in papers. Each team is allowed a maximum of 10 submissions per day for this phase, but again, please use them judiciously. Do not overfit to the test set.

3. **Test Challenge phase**: This phase/split will be used to decide the challenge winners. Each team is allowed 5 submissions per day until the end of the challenge submission phase. The highest-performing of these 5 will be chosen automatically. Results on this split will not be made public until the final results are announced at the [Embodied AI workshop at CVPR](https://embodied-ai.org/).

Note: Your agent will be evaluated on 1000-2000 episodes and will have a total of 24 hours to finish. Submissions will be evaluated on an AWS EC2 p2.xlarge instance, which has a Tesla K80 GPU (12 GB memory), 4 CPU cores, and 61 GB RAM. If you need more time or resources for evaluation of your submission, please get in touch. If you face any issues or have questions, please open an issue on this repository.
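When sanity-checking that remote Minival numbers match your local run, it can help to scrape the metrics out of the evaluation log programmatically. A small stdlib-only sketch (the log format is copied from the sample output shown under Local Evaluation; the function name is illustrative):

```python
import re

# Matches trailing "name: value" pairs such as "spl: 0.0"
METRIC_RE = re.compile(r"(\w+): ([0-9.]+)\s*$")

def parse_metrics(log_text):
    """Extract 'name: value' metric pairs from evaluation log lines."""
    metrics = {}
    for line in log_text.splitlines():
        match = METRIC_RE.search(line)
        if match:
            metrics[match.group(1)] = float(match.group(2))
    return metrics

log = """2019-02-14 21:23:51,798 initializing sim Sim-v0
2020-02-14 21:23:56,339 distance_to_goal: 5.205519378185272
2020-02-14 21:23:56,339 spl: 0.0"""
print(parse_metrics(log))  # {'distance_to_goal': 5.205519378185272, 'spl': 0.0}
```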
### AudioNav Baselines and Starter Code

We include both the configs and the dockerfiles for [av-nav](https://github.com/facebookresearch/sound-spaces/tree/master/ss_baselines/av_nav) and [av-wan](https://github.com/facebookresearch/sound-spaces/tree/master/ss_baselines/av_wan). Note that the [MapNav environment](https://github.com/facebookresearch/sound-spaces/blob/soundspaces-challenge/ss_baselines/av_wan/mapnav_env.py) used by av-wan is baked into the environment container and cannot be changed. If you want to modify the mapping or planning, we suggest re-writing that planning loop in the agent code.

## Acknowledgments

We thank Oleksandr Maksymets and Rishabh Jain for technical support, and the Habitat team for the challenge template.

## References

[1] [SoundSpaces: Audio-Visual Navigation in 3D Environments](https://arxiv.org/pdf/1912.11474.pdf). Changan Chen\*, Unnat Jain\*, Carl Schissler, Sebastia Vicenc Amengual Gari, Ziad Al-Halah, Vamsi Krishna Ithapu, Philip Robinson, Kristen Grauman. ECCV, 2020.

[2] [On evaluation of embodied navigation agents](https://arxiv.org/abs/1807.06757). Peter Anderson, Angel Chang, Devendra Singh Chaplot, Alexey Dosovitskiy, Saurabh Gupta, Vladlen Koltun, Jana Kosecka, Jitendra Malik, Roozbeh Mottaghi, Manolis Savva, Amir R. Zamir. arXiv:1807.06757, 2018.

## License

This repo is MIT licensed, as found in the LICENSE file.