# Semantic SLAM

***Author:*** Xuan Zhang, ***Supervisor:*** David Filliat

Research internship @ENSTA ParisTech

Semantic SLAM can generate a 3D voxel based semantic map using only a hand held RGB-D camera (e.g. Asus Xtion) in real time. We use ORB_SLAM2 as the SLAM backend, a CNN (PSPNet) to produce semantic predictions, and fuse the semantic information into an octomap. Note that our system can also be configured to generate an RGB octomap without semantic information.

Semantic octomap | RGB octomap
:-------------------------:|:-------------------------:
![alt text](docs/images/max.png) | ![alt text](docs/images/rgb.png)

### Project Report & Demo:

- Demo: [[Youtube](https://youtu.be/IwQaRnFmRuU)] [[Youku](http://player.youku.com/embed/XMzc5NTI0OTcyNA==)]
- Project report: [[Google](https://drive.google.com/open?id=1Uk5YLMyoyMGMkE8FtJCNrZp_brGh5z3M)] [[Baidu](https://pan.baidu.com/s/1i67xQAZhqVNJayeg5gmegA)]

### Acknowledgement

This work could not have been done without many open source projects. Special thanks to

- [ORB_SLAM2](https://github.com/raulmur/ORB_SLAM2), used as our SLAM backend.
- [pytorch-semseg](https://github.com/meetshah1995/pytorch-semseg), used as our semantic segmentation library.
- [octomap](https://github.com/OctoMap/octomap), used as our map representation.
- [pcl library](http://pointclouds.org/), used for point cloud processing.

# License

This project is released under a [GPLv3 license](./LICENSE.txt).

# Overview

![alt text](docs/images/overview.jpeg)

# Dependencies

- Openni2_launch

```sh
sudo apt-get install ros-kinetic-openni2-launch
```

- ORB_SLAM2

  We use [ORB_SLAM2](https://github.com/raulmur/ORB_SLAM2) as the SLAM backend. Please refer to the official repo for installation dependencies.

- PyTorch 0.4.0

- For other dependencies, please see [semantic_slam/package.xml](./semantic_slam/package.xml)

# Installation

### Build ORB_SLAM2

After installing the dependencies for ORB_SLAM2, build the library first:

```sh
cd ORB_SLAM2
./build.sh
```

### Install dependencies

```sh
rosdep install semantic_slam
```

### Make

```sh
cd <your catkin workspace>
catkin_make
```

# Run with camera

### Launch rgbd camera

```sh
roslaunch semantic_slam camera.launch
```

### Run ORB_SLAM2 node

```sh
roslaunch semantic_slam slam.launch
```

When the SLAM system has finished initializing, try moving the camera and check that the camera trajectory in the viewer is reasonable; reset SLAM if it is not.

### Run semantic_mapping

You can now run the semantic_cloud node and the octomap_generator node. You will have to provide trained models, see the links below.

```sh
roslaunch semantic_slam semantic_mapping.launch
```

This will also launch rviz for visualization. You can then move the camera around to construct the semantic map. Make sure SLAM does not lose track. If you are constructing a semantic map, you can toggle the display color between semantic color and rgb color by running

```sh
rosservice call toggle_use_semantic_color
```
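The same toggle can be triggered from a script. Below is a minimal rospy sketch; it assumes the service uses the standard `std_srvs/Empty` type (the bare `rosservice call` above suggests an argument-less service, but verify the actual type with `rosservice info` or the semantic_cloud node source):

```python
#!/usr/bin/env python
# Minimal sketch: toggle the octomap display color from Python.
# Assumption: toggle_use_semantic_color is a std_srvs/Empty service.
import rospy
from std_srvs.srv import Empty

rospy.init_node('toggle_color_client')
rospy.wait_for_service('toggle_use_semantic_color')
toggle = rospy.ServiceProxy('toggle_use_semantic_color', Empty)
toggle()  # each call switches between semantic and rgb display color
```

# Run with ros bag

If you want to test semantic mapping without a camera, you can also run a rosbag.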
### Download rosbag with camera position (tracked by SLAM)

[demo.bag](https://drive.google.com/file/d/1j12c_Fruu-ylO1FHYC4sbmlG9IutYJQg/view?usp=sharing)

### Run semantic_mapping

```sh
roslaunch semantic_slam semantic_mapping.launch
```

### Play ROS bag

```sh
rosbag play demo.bag
```

# Trained models

- [Model trained on the ade20k dataset](https://drive.google.com/file/d/1u_BEWdVIYiDnpVmAxwME1z3rnWWkjxm5/view?usp=sharing)
- [Model fine-tuned on the SUNRGBD dataset](https://drive.google.com/file/d/1t26t2VHNOzmjH-0lDTdYzXBACOV_4-eL/view?usp=sharing)

# Run time

Evaluation was done on a computer with a 6-core 1.7 GHz Xeon CPU and a GeForce GTX Titan Z GPU. Input images are 640×480, recorded by an Asus Xtion Pro camera. When the whole system runs together, ORB_SLAM2 works at about 15 Hz (the setting is 30 Hz). Point cloud generation alone can run at 30 Hz. Semantic segmentation runs at about 2 to 3 Hz. Octomap insertion and visualization work at about 1 Hz. Please refer to section 3.6.2 of the project report for more analysis of run times.

# Citation

To cite this project in your research:

```
@misc{semantic_slam,
  author = {Zhang, Xuan and Filliat, David},
  title = {Real-time voxel based 3D semantic mapping with a hand held RGB-D camera},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/floatlazer/semantic_slam}},
}
```

# Configuration

You can change the launch parameters, which are in the `./semantic_slam/params` folder.

***Note that you can set octomap/tree_type and semantic_cloud/point_type to 0 to generate a map with rgb color without doing semantic segmentation.***

### Parameters for octomap_generator node (octomap_generator.yaml)

namespace octomap

- pointcloud_topic
  - Topic of the input point cloud.
- tree_type
  - OcTree type. 0 for ColorOcTree, 1 for SemanticsOcTree using max fusion (keep the most confident semantic color), 2 for SemanticsOcTree using bayesian fusion (fuse the top 3 most confident semantic colors). See the project report for details of the fusion methods.
- world_frame_id
  - Frame id of the world frame.
- resolution
  - Resolution of the octomap, in meters.
- max_range
  - Maximum distance from the camera for a point to be inserted into the octomap, in meters.
- raycast_range
  - Maximum distance from the camera within which raycasting is performed to clear free space, in meters.
- clamping_thres_min
  - Octomap parameter, minimum octree node occupancy during update.
- clamping_thres_max
  - Octomap parameter, maximum octree node occupancy during update.
- occupancy_thres
  - Octomap parameter, octree node occupancy above which a node is considered occupied.
- prob_hit
  - Octomap parameter, hit probability of the sensor model.
- prob_miss
  - Octomap parameter, miss probability of the sensor model.
- save_path
  - Octomap saving path. (not tested)

### Parameters for semantic_cloud node (semantic_cloud.yaml)

namespace camera

- fx, fy, cx, cy
  - Camera intrinsic matrix parameters.
- width, height
  - Image size.

namespace semantic_pcl

- color_image_topic
  - Topic of the input color image.
- depth_image_topic
  - Topic of the input depth image.
- point_type
  - Point cloud type, should be the same as octomap/tree_type. 0 for color point cloud, 1 for semantic point cloud including the top 3 most confident semantic colors and their confidences, 2 for semantic point cloud including only the most confident semantic color and its confidence. See the project report for details of the point cloud types.
- frame_id
  - Point cloud frame id.
- dataset
  - Dataset on which PSPNet is trained, "ade20k" or "sunrgbd".
- model_path
  - Path to the trained PyTorch model.
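For reference, the intrinsics fx, fy, cx, cy above are what relate depth pixels to 3D points under the standard pinhole camera model. The sketch below illustrates that relation with numpy; it shows the math only and is not the exact code of the semantic_cloud node:

```python
# Pinhole back-projection: depth image -> point cloud in camera frame.
# Illustrates the role of fx, fy, cx, cy from semantic_cloud.yaml.
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """depth: HxW array in meters; returns Nx3 points in the camera frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    return points[points[:, 2] > 0]  # drop invalid (zero-depth) pixels
```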
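Similarly, the max fusion rule selected by octomap/tree_type = 1 (keep whichever semantic color has been observed with the highest confidence) can be summarized as below. This is a schematic sketch with illustrative names, not the node's actual API; the exact update rules, including the bayesian top-3 variant, are in the project report and the octomap_generator source:

```python
# Schematic sketch of max fusion (tree_type = 1): a voxel keeps the
# semantic color observed with the highest confidence so far.

def max_fusion(stored, incoming):
    """stored, incoming: (color, confidence) pairs for one voxel."""
    return incoming if incoming[1] > stored[1] else stored

# Example: a 'chair' label at 0.6 replaces a 'table' label at 0.4.
print(max_fusion(('table', 0.4), ('chair', 0.6)))  # -> ('chair', 0.6)
```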