We developed a series of lightweight models named PP-PicoDet. Thanks to their balance of accuracy and latency, these models are well suited to deployment on mobile devices and CPUs. For more details, please refer to our report on arXiv.
Model | Input size | mAPval 0.5:0.95 | mAPval 0.5 | Params (M) | FLOPS (G) | LatencyCPU (ms) | LatencyLite (ms) | Weight | Config | Inference Model |
---|---|---|---|---|---|---|---|---|---|---|
PicoDet-XS | 320*320 | 23.5 | 36.1 | 0.70 | 0.67 | 3.9 | 7.81 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-XS | 416*416 | 26.2 | 39.3 | 0.70 | 1.13 | 6.1 | 12.38 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-S | 320*320 | 29.1 | 43.4 | 1.18 | 0.97 | 4.8 | 9.56 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-S | 416*416 | 32.5 | 47.6 | 1.18 | 1.65 | 6.6 | 15.20 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-M | 320*320 | 34.4 | 50.0 | 3.46 | 2.57 | 8.2 | 17.68 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-M | 416*416 | 37.5 | 53.4 | 3.46 | 4.34 | 12.7 | 28.39 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-L | 320*320 | 36.1 | 52.0 | 5.80 | 4.20 | 11.5 | 25.21 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-L | 416*416 | 39.4 | 55.7 | 5.80 | 7.10 | 20.7 | 42.23 | model, log | config | w/ postprocess, w/o postprocess |
PicoDet-L | 640*640 | 42.6 | 59.2 | 5.80 | 16.81 | 62.5 | 108.1 | model, log | config | w/ postprocess, w/o postprocess |
Latency: CPU latency is measured on an Intel Core i7 10750H CPU with MKLDNN and 12 threads; mobile latency is measured on a Qualcomm Snapdragon 865 (4xA77+4xA55) with 4 threads, ARMv8, and FP16. In the table above, CPU latency is tested with Paddle-Inference and mobile latency (the Lite column) with Paddle-Lite.
Benchmark: for speed benchmarking, set -o export.benchmark=True or manually modify runtime.yml.

Model | Input size | mAPval 0.5:0.95 | mAPval 0.5 | Params (M) | FLOPS (G) | LatencyNCNN (ms) |
---|---|---|---|---|---|---|
YOLOv3-Tiny | 416*416 | 16.6 | 33.1 | 8.86 | 5.62 | 25.42 |
YOLOv4-Tiny | 416*416 | 21.7 | 40.2 | 6.06 | 6.96 | 23.69 |
PP-YOLO-Tiny | 320*320 | 20.6 | - | 1.08 | 0.58 | 6.75 |
PP-YOLO-Tiny | 416*416 | 22.7 | - | 1.08 | 1.02 | 10.48 |
Nanodet-M | 320*320 | 20.6 | - | 0.95 | 0.72 | 8.71 |
Nanodet-M | 416*416 | 23.5 | - | 0.95 | 1.2 | 13.35 |
Nanodet-M 1.5x | 416*416 | 26.8 | - | 2.08 | 2.42 | 15.83 |
YOLOX-Nano | 416*416 | 25.8 | - | 0.91 | 1.08 | 19.23 |
YOLOX-Tiny | 416*416 | 32.8 | - | 5.06 | 6.45 | 32.77 |
YOLOv5n | 640*640 | 28.4 | 46.0 | 1.9 | 4.5 | 40.35 |
YOLOv5s | 640*640 | 37.2 | 56.0 | 7.2 | 16.5 | 78.05 |
# training on single-GPU
export CUDA_VISIBLE_DEVICES=0
python tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
If the GPU runs out of memory during training, reduce batch_size in TrainReader and reduce base_lr in LearningRate proportionally. The published configs are all trained with 4 GPUs, so if you train on 1 GPU instead, base_lr needs to be reduced by a factor of 4.
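The proportional adjustment described above can be sketched as a small helper. This is an illustration only: the function name and the reference values (a base_lr of 0.32 and a per-GPU batch size of 64) are hypothetical placeholders, so read the real values from the config you are training with.

```python
def scale_base_lr(base_lr, ref_gpus, ref_batch_size, gpus, batch_size):
    """Scale base_lr in proportion to the total batch size.

    Total batch size is taken as (number of GPUs) x (per-GPU batch_size).
    """
    if min(ref_gpus, ref_batch_size, gpus, batch_size) <= 0:
        raise ValueError("GPU counts and batch sizes must be positive")
    ref_total = ref_gpus * ref_batch_size
    new_total = gpus * batch_size
    return base_lr * new_total / ref_total


# Hypothetical reference values, NOT taken from any published config.
REF_BASE_LR = 0.32
REF_BATCH_SIZE = 64

# 4 GPUs -> 1 GPU with the same per-GPU batch size: base_lr shrinks by 4x.
single_gpu_lr = scale_base_lr(REF_BASE_LR, 4, REF_BATCH_SIZE, 1, REF_BATCH_SIZE)

# Same 4 GPUs, per-GPU batch size halved to fit in memory: base_lr halves.
halved_batch_lr = scale_base_lr(REF_BASE_LR, 4, REF_BATCH_SIZE, 4, REF_BATCH_SIZE // 2)

print(single_gpu_lr, halved_batch_lr)
```

Applying the same rule when training on more than 4 GPUs is an assumption of this sketch rather than something the configs state.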
# training on multi-GPU
export CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
python -m paddle.distributed.launch --gpus 0,1,2,3,4,5,6,7 tools/train.py -c configs/picodet/picodet_s_320_coco_lcnet.yml --eval
python tools/eval.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
python tools/infer.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams
For details, please refer to the Quick start guide.
cd PaddleDetection
python tools/export_model.py -c configs/picodet/picodet_s_320_coco_lcnet.yml \
-o weights=https://paddledet.bj.bcebos.com/models/picodet_s_320_coco_lcnet.pdparams \
--output_dir=output_inference
Post-processing option: -o export.benchmark=True (if -o already appears in the command, delete the -o here), or manually modify the corresponding fields in runtime.yml.
NMS option: -o export.nms=True, or manually modify the corresponding fields in runtime.yml. Many scenarios that consume exported ONNX models only support a single input and fixed-shape output, so when exporting to ONNX it is recommended not to export NMS.
pip install paddlelite
# FP32
paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp32
# FP16
paddle_lite_opt --model_dir=output_inference/picodet_s_320_coco_lcnet --valid_targets=arm --optimize_out=picodet_s_320_coco_fp16 --enable_fp16=true
pip install onnx
pip install paddle2onnx==0.9.2
paddle2onnx --model_dir output_inference/picodet_s_320_coco_lcnet/ \
--model_filename model.pdmodel \
--params_filename model.pdiparams \
--opset_version 11 \
--save_file picodet_s_320_coco.onnx
Simplify the ONNX model with onnx-simplifier:
pip install onnxsim
onnxsim picodet_s_320_coco.onnx picodet_s_processed.onnx
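When NMS is left out of the exported model, as recommended for ONNX, suppression has to run in application code. The sketch below shows generic greedy IoU-based NMS for a single class in plain Python. It is not PicoDet's own post-processing: decoding the raw model outputs into boxes and scores is model-specific and not shown, and the function names and the 0.5 threshold are illustrative assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS for one class: return kept indices, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping boxes and one separate box (made-up data).
boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

For multi-class detection, one common approach is to run this per class after filtering by a score threshold. A vectorized implementation is usually preferable for real workloads.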
Model | Input size | ONNX | Paddle Lite(fp32) | Paddle Lite(fp16) |
---|---|---|---|---|
PicoDet-XS | 320*320 | w/ postprocess, w/o postprocess | model | model |
PicoDet-XS | 416*416 | w/ postprocess, w/o postprocess | model | model |
PicoDet-S | 320*320 | w/ postprocess, w/o postprocess | model | model |
PicoDet-S | 416*416 | w/ postprocess, w/o postprocess | model | model |
PicoDet-M | 320*320 | w/ postprocess, w/o postprocess | model | model |
PicoDet-M | 416*416 | w/ postprocess, w/o postprocess | model | model |
PicoDet-L | 320*320 | w/ postprocess, w/o postprocess | model | model |
PicoDet-L | 416*416 | w/ postprocess, w/o postprocess | model | model |
PicoDet-L | 640*640 | w/ postprocess, w/o postprocess | model | model |
Infer Engine | Python | C++ | Predict With Postprocess |
---|---|---|---|
OpenVINO | Python | C++(postprocess coming soon) | ✔︎ |
Paddle Lite | - | C++ | ✔︎ |
Android Demo | - | Paddle Lite | ✔︎ |
PaddleInference | Python | C++ | ✔︎ |
ONNXRuntime | Python | Coming soon | ✔︎ |
NCNN | Coming soon | C++ | ✘ |
MNN | Coming soon | C++ | ✘ |
Android demo visualization:
Install:
pip install paddleslim==2.2.2
Configure the quant config and start training:
python tools/train.py -c configs/picodet/picodet_s_416_coco_lcnet.yml \
--slim_config configs/slim/quant/picodet_s_416_lcnet_quant.yml --eval
Quant Model | Input size | mAPval 0.5:0.95 | Configs | Weight | Inference Model | Paddle Lite(INT8) |
---|---|---|---|---|---|---|
PicoDet-S | 416*416 | 31.5 | config, slim config | model | w/ postprocess, w/o postprocess | w/ postprocess, w/o postprocess |
Please refer to this documentation for details such as requirements, training and deployment.
Pedestrian detection: for the model zoo of PicoDet-S-Pedestrian, please refer to PP-TinyPose.
Mainbody detection: for the model zoo of PicoDet-L-Mainbody, please refer to mainbody detection.
Please reduce the batch_size of TrainReader in the config.
Please set pretrain_weights in the config to weights trained on COCO, for example:
pretrain_weights: https://paddledet.bj.bcebos.com/models/picodet_l_640_coco_lcnet.pdparams
Please use the PicoDet-LCNet model, which has fewer transpose operators.
You can insert the following code here to count learnable parameters.
params = sum([
    p.numel() for n, p in self.model.named_parameters()
    if all([x not in n for x in ['_mean', '_variance']])
])  # exclude BatchNorm running statistics
print('params: ', params)
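To see what the name filter in the snippet above excludes, here is a framework-free sketch that applies the same rule to (name, element count) pairs. The helper name, the parameter names, and the sizes are all made up for illustration; a real model supplies them through named_parameters().

```python
def count_learnable(named_numels, excluded=("_mean", "_variance")):
    """Sum element counts, skipping names that contain an excluded substring."""
    return sum(
        numel for name, numel in named_numels
        if all(x not in name for x in excluded)
    )


# Hypothetical layer: a 3x3 conv (3 -> 32 channels, no bias) plus BatchNorm.
fake_params = [
    ("conv.weight", 3 * 3 * 3 * 32),  # learnable
    ("bn.weight", 32),                # learnable scale
    ("bn.bias", 32),                  # learnable shift
    ("bn._mean", 32),                 # running mean, excluded
    ("bn._variance", 32),             # running variance, excluded
]

learnable = count_learnable(fake_params)
everything = count_learnable(fake_params, excluded=())
print(learnable, everything)
```

The difference between the two totals is exactly the BatchNorm running statistics, which are tracked during training but not updated by gradients.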
If you use PicoDet in your research, please cite our work by using the following BibTeX entry:
@misc{yu2021pppicodet,
title={PP-PicoDet: A Better Real-Time Object Detector on Mobile Devices},
author={Guanghua Yu and Qinyao Chang and Wenyu Lv and Chang Xu and Cheng Cui and Wei Ji and Qingqing Dang and Kaipeng Deng and Guanzhong Wang and Yuning Du and Baohua Lai and Qiwen Liu and Xiaoguang Hu and Dianhai Yu and Yanjun Ma},
year={2021},
eprint={2111.00902},
archivePrefix={arXiv},
primaryClass={cs.CV}
}