[Update:] I've further simplified the code for PyTorch 1.5 and torchvision 0.6, and replaced the custom RoI pooling and NMS ops with the ones from torchvision. If you want the old version of the code, please check out branch v1.0.
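For reference, here is a minimal sketch of how the torchvision replacements are typically used. The shapes, thresholds, and feature-map scale below are illustrative assumptions, not values taken from this repo.

```python
import torch
from torchvision.ops import nms, RoIPool

# dummy proposals in (x1, y1, x2, y2) format with random scores
boxes = torch.rand(100, 4) * 100
boxes[:, 2:] += boxes[:, :2]          # make sure x2 > x1 and y2 > y1
scores = torch.rand(100)

# non-maximum suppression from torchvision; returns indices of kept boxes
keep = nms(boxes, scores, iou_threshold=0.7)

# RoI pooling from torchvision; rois are (batch_idx, x1, y1, x2, y2)
features = torch.rand(1, 512, 50, 50)  # e.g. a VGG16 conv feature map
rois = torch.cat([torch.zeros(len(keep), 1), boxes[keep]], dim=1)
pooled = RoIPool(output_size=(7, 7), spatial_scale=1. / 16)(features, rois)
print(pooled.shape)                    # (num_kept, 512, 7, 7)
```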
This project is a simplified Faster R-CNN implementation based on chainercv and other projects. I hope it can serve as a starting point for those who want to know the details of Faster R-CNN. It aims to:
And it has the following features:
VGG16 trained on `trainval` and tested on the `test` split.
Note: the training shows great randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound.
Implementation | mAP |
---|---|
original paper | 0.699 |
train with caffe pretrained model | 0.700-0.712 |
train with torchvision pretrained model | 0.685-0.701 |
model converted from chainercv (reported 0.706) | 0.7053 |
Implementation | GPU | Inference | Training |
---|---|---|---|
original paper | K40 | 5 fps | NA |
This[1] | TITAN Xp | 14-15 fps | 6 fps |
pytorch-faster-rcnn | TITAN Xp | 15-17 fps | 6 fps |
[1]: make sure you install cupy correctly and that only one program runs on the GPU. The training speed is sensitive to your GPU status. See troubleshooting for more info. Moreover, it's slow at the start of the program -- it needs time to warm up.
It could be faster by removing visualization, logging, loss averaging, etc.
Here is an example of creating an environment from scratch with anaconda:
# create conda env
conda create --name simp python=3.7
conda activate simp
# install pytorch
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
# install other dependencies
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet
# If installation of some packages fails, you can install from a mirror, e.g.:
# (see https://www.pythonheidong.com/blog/article/793721/748f4c139923d7da15df/)
pip install pprint -i https://pypi.doubanio.com/simple/ --trusted-host pypi.douban.com
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet numpy -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
# start visdom
nohup python -m visdom.server &
If you don't use anaconda, then:
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet
nohup python -m visdom.server &
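Once the server is up, you can sanity-check the connection from Python. This is an illustrative snippet only; the env name `fasterrcnn` simply mirrors the training command used later.

```python
import numpy as np
import visdom

# assumes the visdom server started above is running on localhost:8097
vis = visdom.Visdom(env='fasterrcnn')
assert vis.check_connection(), 'visdom server is not reachable'

# plot a dummy loss curve so you can see what training plots will look like
steps = np.arange(10)
vis.line(X=steps, Y=np.exp(-steps / 5.0), win='demo_loss',
         opts=dict(title='dummy loss', xlabel='step', ylabel='loss'))
```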
Download the pretrained model from Google Drive or Baidu Netdisk (password: scxn).
See demo.ipynb for more detail.
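If you prefer a script over the notebook, the gist of the demo looks roughly like the sketch below. The module paths (`model.FasterRCNNVGG16`, `trainer.FasterRCNNTrainer`, `data.util.read_image`, `utils.vis_tool.vis_bbox`) follow the repo layout as I understand it, and the checkpoint path is a placeholder; treat this as a sketch, not a drop-in replacement for demo.ipynb.

```python
import torch
from utils.config import opt
from model import FasterRCNNVGG16          # assumed module path
from trainer import FasterRCNNTrainer      # assumed module path
from data.util import read_image           # assumed module path
from utils.vis_tool import vis_bbox        # assumed module path
from utils import array_tool as at

img = read_image('misc/demo.jpg')          # CHW float32 image
img = torch.from_numpy(img)[None]          # add a batch dimension

faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('/path/to/downloaded/checkpoint.pth')  # placeholder path
opt.caffe_pretrain = True                  # match the checkpoint you downloaded

bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)
vis_bbox(at.tonumpy(img[0]),
         at.tonumpy(bboxes[0]),
         at.tonumpy(labels[0]).reshape(-1),
         at.tonumpy(scores[0]).reshape(-1))
```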
Download the training, validation, and test data and VOCdevkit:
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
Extract all of these tars into one directory named VOCdevkit:
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
It should have this basic structure:
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
Modify the `voc_data_dir` cfg item in `utils/config.py`, or pass it to the program with an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/`.
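For context, the config override works along these lines. This is a simplified, hypothetical sketch of the pattern, not the actual contents of `utils/config.py`.

```python
from pprint import pprint

class Config:
    # defaults (illustrative; see utils/config.py for the real options)
    voc_data_dir = '/path/to/VOCdevkit/VOC2007/'
    min_size = 600    # resize the shorter image side to this
    max_size = 1000   # cap the longer image side at this

    def _parse(self, kwargs):
        # override known attributes from command-line style kwargs
        for k, v in kwargs.items():
            if not hasattr(self, k):
                raise ValueError('Unknown option: --%s' % k)
            setattr(self, k, v)
        pprint({k: getattr(self, k) for k in dir(self)
                if not k.startswith('_')})

opt = Config()
opt._parse({'voc_data_dir': '/data/VOCdevkit/VOC2007/'})  # like --voc-data-dir=...
```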
If you want to use the caffe-pretrained model as the initial weights, you can run the command below to get VGG16 weights converted from caffe, which is what the original paper uses.
python misc/convert_caffe_pretrain.py
This script downloads the pretrained model and converts it to a format compatible with torchvision. If you are in China and cannot download the pretrained model, you may refer to this issue.
Then you can specify where the caffe-pretrained model `vgg16_caffe.pth` is stored in `utils/config.py` by setting `caffe_pretrain_path`. The default path is OK.
If you want to use the pretrained model from torchvision, you may skip this step.
NOTE: the caffe pretrained model has shown slightly better performance.
NOTE: the caffe model requires images in BGR with values in 0-255, while the torchvision model requires images in RGB with values in 0-1. See `data/dataset.py` for more detail.
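The two conventions look roughly like the sketch below. The exact mean/std values are assumptions on my part; check `data/dataset.py` for the ones the repo actually uses.

```python
import numpy as np

def caffe_normalize(img_rgb_01):
    """Caffe-style: BGR channel order, 0-255 range, mean-subtracted."""
    img = img_rgb_01[[2, 1, 0], :, :] * 255.0   # RGB 0-1 -> BGR 0-255
    mean = np.array([122.7717, 115.9465, 102.9801]).reshape(3, 1, 1)
    return img - mean

def torchvision_normalize(img_rgb_01):
    """Torchvision-style: RGB, 0-1 range, ImageNet mean/std normalization."""
    mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
    return (img_rgb_01 - mean) / std

img = np.random.rand(3, 600, 800).astype(np.float32)  # dummy CHW RGB image in [0, 1]
print(caffe_normalize(img).shape, torchvision_normalize(img).shape)
```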
python train.py train --env='fasterrcnn' --plot-every=100
You may refer to `utils/config.py` for more arguments.
Some key arguments:
- `--caffe-pretrain=False`: use the pretrained model from caffe or torchvision (default: torchvision)
- `--plot-every=n`: visualize predictions, loss, etc. every `n` batches
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC data is stored
- `--use-drop`: use dropout in the RoI head, default False
- `--use-Adam`: use Adam instead of SGD, default SGD (you need to set a very low `lr` for Adam)
- `--load-path`: pretrained model path, default `None`; if it's specified, it will be loaded

You may open a browser, visit `http://<ip>:8097`, and see the visualization of the training procedure as below:
dataloader: received 0 items of ancdata
See discussion; it's already fixed in train.py, so I think you are free from this problem.
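For reference, the usual workarounds for this error look like the following. This is illustrative only; the numbers are not taken from train.py.

```python
import resource
import torch.multiprocessing

# 1) raise the per-process open-file limit (values are illustrative)
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(20480, hard), hard))

# 2) or switch the sharing strategy used by DataLoader worker processes
torch.multiprocessing.set_sharing_strategy('file_system')
```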
Windows support
I don't have a Windows machine with a GPU to debug and test it. Pull requests adding and testing Windows support are welcome.
This work builds on many excellent works, including:
Licensed under MIT; see the LICENSE for more detail.

Contribution welcome. If you encounter any problem, feel free to open an issue, though I've been too busy lately. Correct me if anything is wrong or unclear.

model structure