[Update:] I've further simplified the code for PyTorch 1.5 and torchvision 0.6, and replaced the custom RoI pooling and NMS ops with the ones from torchvision. If you want the old version of the code, please check out branch v1.0.
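For reference, here is a minimal sketch of how the torchvision replacements are typically used. The shapes, thresholds, and feature-map scale below are illustrative assumptions, not values taken from this repo.

```python
import torch
from torchvision.ops import nms, RoIPool

# dummy proposals in (x1, y1, x2, y2) format with random scores
boxes = torch.rand(100, 4) * 100
boxes[:, 2:] += boxes[:, :2]          # make sure x2 > x1 and y2 > y1
scores = torch.rand(100)

# non-maximum suppression from torchvision; returns indices of kept boxes
keep = nms(boxes, scores, iou_threshold=0.7)

# RoI pooling from torchvision; rois are (batch_idx, x1, y1, x2, y2)
features = torch.rand(1, 512, 50, 50)  # e.g. a VGG16 conv feature map
rois = torch.cat([torch.zeros(len(keep), 1), boxes[keep]], dim=1)
pooled = RoIPool(output_size=(7, 7), spatial_scale=1. / 16)(features, rois)
print(pooled.shape)                    # (num_kept, 512, 7, 7)
```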
This project is a simplified Faster R-CNN implementation based on chainercv and other projects. I hope it can serve as a starting point for those who want to know the details of Faster R-CNN. It aims to:
And it has the following features:
VGG16 trained on `trainval` and tested on the `test` split.
Note: the training shows great randomness; you may need a bit of luck and more epochs of training to reach the highest mAP. However, it should be easy to surpass the lower bound.
Implementation | mAP |
---|---|
original paper | 0.699 |
train with caffe pretrained model | 0.700-0.712 |
train with torchvision pretrained model | 0.685-0.701 |
model converted from chainercv (reported 0.706) | 0.7053 |
Implementation | GPU | Inference | Training |
---|---|---|---|
original paper | K40 | 5 fps | NA |
This[1] | TITAN Xp | 14-15 fps | 6 fps |
pytorch-faster-rcnn | TITAN Xp | 15-17 fps | 6 fps |
[1]: make sure you install cupy correctly and that only one program runs on the GPU. The training speed is sensitive to your GPU status. See troubleshooting for more info. Moreover, it's slow at the start of the program -- it needs time to warm up.
It could be faster by removing visualization, logging, loss averaging, etc.
Here is an example of creating an environment from scratch with anaconda:
# create conda env
conda create --name simp python=3.7
conda activate simp
# install pytorch
conda install pytorch torchvision cudatoolkit=10.2 -c pytorch
# install other dependencies
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet
# If installation of some packages fails, you can install from a mirror, e.g.:
# (see https://www.pythonheidong.com/blog/article/793721/748f4c139923d7da15df/)
pip install pprint -i https://pypi.doubanio.com/simple/ --trusted-host pypi.douban.com
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet numpy -i https://mirrors.aliyun.com/pypi/simple/ --trusted-host mirrors.aliyun.com
# start visdom
nohup python -m visdom.server &
If you don't use anaconda, then:
pip install visdom scikit-image tqdm fire ipdb pprint matplotlib torchnet
nohup python -m visdom.server &
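Once the server is up, you can sanity-check the connection from Python. This is an illustrative snippet only; the env name `fasterrcnn` simply mirrors the training command used later.

```python
import numpy as np
import visdom

# assumes the visdom server started above is running on localhost:8097
vis = visdom.Visdom(env='fasterrcnn')
assert vis.check_connection(), 'visdom server is not reachable'

# plot a dummy loss curve so you can see what training plots will look like
steps = np.arange(10)
vis.line(X=steps, Y=np.exp(-steps / 5.0), win='demo_loss',
         opts=dict(title='dummy loss', xlabel='step', ylabel='loss'))
```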
Download the pretrained model from Google Drive or Baidu Netdisk (password: scxn).
See demo.ipynb for more detail.
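If you prefer a script over the notebook, the gist of the demo looks roughly like the sketch below. The module paths (`model.FasterRCNNVGG16`, `trainer.FasterRCNNTrainer`, `data.util.read_image`, `utils.vis_tool.vis_bbox`) follow the repo layout as I understand it, and the checkpoint path is a placeholder; treat this as a sketch, not a drop-in replacement for demo.ipynb.

```python
import torch
from utils.config import opt
from model import FasterRCNNVGG16          # assumed module path
from trainer import FasterRCNNTrainer      # assumed module path
from data.util import read_image           # assumed module path
from utils.vis_tool import vis_bbox        # assumed module path
from utils import array_tool as at

img = read_image('misc/demo.jpg')          # CHW float32 image
img = torch.from_numpy(img)[None]          # add a batch dimension

faster_rcnn = FasterRCNNVGG16()
trainer = FasterRCNNTrainer(faster_rcnn).cuda()
trainer.load('/path/to/downloaded/checkpoint.pth')  # placeholder path
opt.caffe_pretrain = True                  # match the checkpoint you downloaded

bboxes, labels, scores = trainer.faster_rcnn.predict(img, visualize=True)
vis_bbox(at.tonumpy(img[0]),
         at.tonumpy(bboxes[0]),
         at.tonumpy(labels[0]).reshape(-1),
         at.tonumpy(scores[0]).reshape(-1))
```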
Download the training, validation, and test data and VOCdevkit:
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar
wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCdevkit_08-Jun-2007.tar
Extract all of these tars into one directory named VOCdevkit:
tar xvf VOCtrainval_06-Nov-2007.tar
tar xvf VOCtest_06-Nov-2007.tar
tar xvf VOCdevkit_08-Jun-2007.tar
It should have this basic structure:
$VOCdevkit/ # development kit
$VOCdevkit/VOCcode/ # VOC utility code
$VOCdevkit/VOC2007 # image sets, annotations, etc.
# ... and several other directories ...
Modify the `voc_data_dir` cfg item in `utils/config.py`, or pass it to the program with an argument like `--voc-data-dir=/path/to/VOCdevkit/VOC2007/`.
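For context, the config override works along these lines. This is a simplified, hypothetical sketch of the pattern, not the actual contents of `utils/config.py`.

```python
from pprint import pprint

class Config:
    # defaults (illustrative; see utils/config.py for the real options)
    voc_data_dir = '/path/to/VOCdevkit/VOC2007/'
    min_size = 600    # resize the shorter image side to this
    max_size = 1000   # cap the longer image side at this

    def _parse(self, kwargs):
        # override known attributes from command-line style kwargs
        for k, v in kwargs.items():
            if not hasattr(self, k):
                raise ValueError('Unknown option: --%s' % k)
            setattr(self, k, v)
        pprint({k: getattr(self, k) for k in dir(self)
                if not k.startswith('_')})

opt = Config()
opt._parse({'voc_data_dir': '/data/VOCdevkit/VOC2007/'})  # like --voc-data-dir=...
```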
If you want to use the caffe-pretrained model as the initial weights, you can run the command below to get VGG16 weights converted from caffe, which is what the original paper uses.
python misc/convert_caffe_pretrain.py
This script downloads the pretrained model and converts it to a format compatible with torchvision. If you are in China and cannot download the pretrained model, you may refer to this issue.
Then you can specify where the caffe-pretrained model `vgg16_caffe.pth` is stored in `utils/config.py` by setting `caffe_pretrain_path`. The default path is OK.
If you want to use the pretrained model from torchvision, you may skip this step.
NOTE: the caffe pretrained model has shown slightly better performance.
NOTE: the caffe model requires images in BGR with values in 0-255, while the torchvision model requires images in RGB with values in 0-1. See `data/dataset.py` for more detail.
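The two conventions look roughly like the sketch below. The exact mean/std values are assumptions on my part; check `data/dataset.py` for the ones the repo actually uses.

```python
import numpy as np

def caffe_normalize(img_rgb_01):
    """Caffe-style: BGR channel order, 0-255 range, mean-subtracted."""
    img = img_rgb_01[[2, 1, 0], :, :] * 255.0   # RGB 0-1 -> BGR 0-255
    mean = np.array([122.7717, 115.9465, 102.9801]).reshape(3, 1, 1)
    return img - mean

def torchvision_normalize(img_rgb_01):
    """Torchvision-style: RGB, 0-1 range, ImageNet mean/std normalization."""
    mean = np.array([0.485, 0.456, 0.406]).reshape(3, 1, 1)
    std = np.array([0.229, 0.224, 0.225]).reshape(3, 1, 1)
    return (img_rgb_01 - mean) / std

img = np.random.rand(3, 600, 800).astype(np.float32)  # dummy CHW RGB image in [0, 1]
print(caffe_normalize(img).shape, torchvision_normalize(img).shape)
```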
python train.py train --env='fasterrcnn' --plot-every=100
You may refer to `utils/config.py` for more arguments.
Some key arguments:
- `--caffe-pretrain=False`: use the pretrained model from caffe or torchvision (default: torchvision)
- `--plot-every=n`: visualize predictions, loss, etc. every `n` batches
- `--env`: visdom env for visualization
- `--voc_data_dir`: where the VOC data is stored
- `--use-drop`: use dropout in the RoI head, default False
- `--use-Adam`: use Adam instead of SGD, default SGD (you need to set a very low `lr` for Adam)
- `--load-path`: pretrained model path, default `None`; if it's specified, it will be loaded

You may open a browser, visit `http://<ip>:8097`, and see the visualization of the training procedure as below:
dataloader: received 0 items of ancdata
See discussion; it's already fixed in train.py, so I think you are free from this problem.
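For reference, the usual workarounds for this error look like the following. This is illustrative only; the numbers are not taken from train.py.

```python
import resource
import torch.multiprocessing

# 1) raise the per-process open-file limit (values are illustrative)
soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
resource.setrlimit(resource.RLIMIT_NOFILE, (min(20480, hard), hard))

# 2) or switch the sharing strategy used by DataLoader worker processes
torch.multiprocessing.set_sharing_strategy('file_system')
```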
Windows support
I don't have a Windows machine with a GPU to debug and test it. Pull requests adding and testing Windows support are welcome.
This work builds on many excellent works, including:
Licensed under MIT; see the LICENSE for more detail.

Contribution welcome. If you encounter any problem, feel free to open an issue, though I've been too busy lately. Correct me if anything is wrong or unclear.

model structure