# rexnet **Repository Path**: sunxiangkang/rexnet ## Basic Information - **Project Name**: rexnet - **Description**: No description available - **Primary Language**: Unknown - **License**: MIT - **Default Branch**: master - **Homepage**: None - **GVP Project**: No ## Statistics - **Stars**: 0 - **Forks**: 0 - **Created**: 2021-03-19 - **Last Updated**: 2021-03-19 ## Categories & Tags **Categories**: Uncategorized **Tags**: None ## README #### (NOTICE) Our paper has been accepted at CVPR 2021!! The submitted paper will be updated at arxiv! #### (NOTICE) New models ReXNet-Lites which outperform EfficientNet-Lites will be uploaded soon! ## ReXNet: Diminishing Representational Bottleneck on Convolutional Neural Network **Dongyoon Han, Sangdoo Yun, Byeongho Heo, and YoungJoon Yoo** | [Paper](https://arxiv.org/pdf/2007.00992.pdf) | [Pretrained Models](#pretrained) AI LAB, NAVER Corp. ## Abstract This paper addresses representational bottleneck in a network and propose a set of design principles that improves model performance significantly. We argue that a representational bottleneck may happen in a network designed by a conventional design and results in degrading the model performance. To investigate the representational bottleneck, we study the matrix rank of the features generated by ten thousand random networks. We further study the entire layer's channel configuration towards designing more accurate network architectures. Based on the investigation, we propose simple yet effective design principles to mitigate the representational bottleneck. Slight changes on baseline networks by following the principle leads to achieving remarkable performance improvements on ImageNet classification. Additionally, COCO object detection results and transfer learning results on several datasets provide other backups of the link between diminishing representational bottleneck of a network and improving performance. ## ReXNets vs. EfficientNets ### Accuracy vs computational costs

### Actual performance scores - The CPU latencies are tested on Xeon E5-2630_v4 with a single image and the GPU latencies iare measured on M40 PGUs with **the batchsize of 64**. - EfficientNets' scores are taken form [arxiv v3 of the paper](https://arxiv.org/pdf/1905.11946v3.pdf). Model | Input Res. | Top-1 acc. | Top-5 acc. | FLOPs/params. | CPU Lat./ GPU Lat. :--: |:--:|:--:|:--:|:--:|:--:| EfficientNet-B0 | 224x224 | 77.3 | 93.5 | 0.39B/5.3M | 47ms/71ms **ReXNet_1.0** | 224x224 | 77.9 | 93.9 | 0.40B/4.8M | 47ms/68ms ||||| EfficientNet-B1 | 240x240 | 79.2 | 94.5 | 0.70B/7.8M | 70ms/112ms **ReXNet_1.3** | 224x224 | 79.5 | 94.7| 0.66B/7.6M | 55ms/84ms ||||| EfficientNet-B2 | 260x260 | 80.3 | 95.0 | 1.0B/9.2M | 77ms/141ms **ReXNet_1.5** | 224x224 | 80.3 | 95.2| 0.88B/9.7M | 59ms/92ms ||||| EfficientNet-B3 | 300x300 | 81.7 | 95.6 | 1.8B/12M | 100ms/223ms **ReXNet_2.0** | 224x224 | 81.6 | 95.7 | 1.8B/19M | 69ms/118ms ## Model performances

ImageNet classification results

- Please refer the following pretrained models. Top-1 and top-5 accuraies are reported with the computational costs. - Note that all the models are trained and evaluated with 224x224 image size. Model | Input Res. | Top-1 acc. | Top-5 acc. | FLOPs/params | :--: |:--:|:--:|:--:|:--: [ReXNet_1.0](https://drive.google.com/file/d/1xeIJ3wb83uOowU008ykYj6wDX2dsncA9/view?usp=sharing) | 224x224 | 77.9 | 93.9 | 0.40B/4.8M | [ReXNet_1.3](https://drive.google.com/file/d/1x2ziK9Oyv66Y9NsxJxXsdjzpQF2uSJj0/view?usp=sharing) | 224x224 | 79.5 | 94.7 | 0.66B/7.6M | [ReXNet_1.5](https://drive.google.com/file/d/1TOBGsbDhTHWBgqcRnyKIR0tHsJTOPUIG/view?usp=sharing) | 224x224 | 80.3 | 95.2 | 0.66B/7.6M | [ReXNet_2.0](https://drive.google.com/file/d/1R1aOTKIe1Mvck86NanqcjWnlR8DY-Z4C/view?usp=sharing) | 224x224 | 81.6 | 95.7 | 1.5B/16M | [ReXNet_3.0](https://drive.google.com/file/d/1chOnQPKtE1LaLz6WzSdzH55x6VlZ29sQ/view?usp=sharing) | 224x224 | 82.8 | 96.2 | 3.4B/34M | ### Finetuning results #### COCO Object detection - The following results are trained with **Faster RCNN with FPN**: | Backbone |Img. Size| B_AP (%) | B_AP_0.5 (%) | B_AP_0.75 (%) | Params. |FLOPs | Eval. set| |:----:|:----:|:----:|:----:|:----:|:---:|:---:|:---:| | FBNet-C-FPN | 1200x800 | 35.1 | 57.4 | 37.2 | 21.4M | 119.0B | val2017 | | EfficientNetB0-FPN | 1200x800 | 38.0 | 60.1 | 40.4 | 21.0M | 123.0B | val2017| | ReXNet_0.9-FPN | 1200x800 | 38.0 | **60.6** | 40.8 | 20.1M | 123.0B | val2017| | ReXNet_1.0-FPN | 1200x800 | **38.5** | **60.6** | **41.5** | 20.7M | 124.1B | val2017| ||||||||| | ResNet50-FPN | 1200x800 | 37.6| 58.2| 40.9 | 41.8M | 202.2B | val2017| | ResNeXt-101-FPN | 1200x800 | 40.3 | 62.1 | 44.1 | 60.4M | 272.4B | val2017| | ReXNet_2.2-FPN | 1200x800| **41.5** | **64.0** | **44.9** | 33.0M | 153.8B | val2017| #### COCO instance segmentation - The following results are trained with **Mask RCNN with FPN**, S_AP and B_AP denote segmentation AP and box AP, respectively: | Backbone |Img. Size| S_AP (%) | S_AP_0.5 (%) | S_AP_0.75 (%) | B_AP (%) | B_AP_0.5 (%) | B_AP_0.75 (%) | Params. |FLOPs | Eval. set| |:----:|:----:|:----:|:----:|:----:|:---:|:---:|:---:|:---:|:---:|:---:| | EfficientNetB0_FPN | 1200x800 | 34.8 | 56.8 | 36.6 | 38.4 | 60.2 | 40.8 | 23.7M | 123.0B | val2017| | ReXNet_0.9-FPN | 1200x800 | **35.2** | **57.4**| **37.1** |**38.7** |**60.8**|**41.6**| 22.8M | 123.0B | val2017| | ReXNet_1.0-FPN | 1200x800 | 35.4 | 57.7 | 37.4 | 38.9 |61.1 | 42.1 | 23.3M | 124.1B | val2017| |||||||||||| | ResNet50-FPN | 1200x800 | 34.6 | 55.9 | 36.8 |38.5 |59.0|41.6| 44.2M | 207B | val2017| | ReXNet_2.2-FPN | 1200x800 | **37.8** | **61.0** | **40.2** | **42.0** | **64.5** | **45.6**| 35.6M | 153.8B | val2017| ### Transfer learning results - Using ImageNet-pretrained models to transfer on the fine-grained datasets:

### ReXNet-lites vs. EfficientNet-lites #### Actual performance scores - We compare ReXNet-lites with [EfficientNet-lites](https://github.com/tensorflow/tpu/tree/master/models/official/efficientnet/lite). Model | Input Res. | Top-1 acc. | Top-5 acc. | FLOPs/params | CPU Lat./ GPU Lat. :--: |:--:|:--:|:--:|:--:|:--:| EfficientNet-lite0 | 224x224 | 75.1 | - | 0.41B/4.7M | 30ms/49ms **ReXNet-lite_1.0** | 224x224 | 76.2 | 92.8 | 0.41B/4.7M | 31ms/49ms ||||| EfficientNet-lite1 | 240x240 | 76.7 | - | 0.63B/5.4M | 44ms/73ms **ReXNet-lite_1.3** | 224x224 | 77.8 | 93.8 | 0.65B/6.8M | 36ms/61ms ||||| EfficientNet-lite2 | 260x260 | 77.6 | - | 0.90B/ 6.1M | 48ms/93ms **ReXNet-lite_1.5** | 224x224 | 78.6 | 94.2| 0.84B/8.3M| 39ms/68ms ||||| EfficientNet-lite3 | 280x280| 79.8 | - | 1.4B/ 8.2M | 60ms/131ms **ReXNet-lite_2.0** | 224x224 | 80.2 | 95.0 | 1.5B/13M | 49ms/90ms ## Getting Started ### Requirements - Python3 - PyTorch (> 1.0) - Torchvision (> 0.2) - NumPy ### Using the pretrained models - Usage is the same as the other models officially released in pytorch [Torchvision](https://pytorch.org/docs/stable/torchvision/models.html). - Using models in GPUs: ```python import torch import rexnetv1 model = rexnetv1.ReXNetV1(width_mult=1.0).cuda() model.load_state_dict(torch.load('./rexnetv1_1.0x.pth')) model.eval() print(model(torch.randn(1, 3, 224, 224).cuda())) ``` - For CPUs: ```python import torch import rexnetv1 model = rexnetv1.ReXNetV1(width_mult=1.0) model.load_state_dict(torch.load('./rexnetv1_1.0x.pth', map_location=torch.device('cpu'))) model.eval() print(model(torch.randn(1, 3, 224, 224))) ``` ### Training own ReXNet ReXNet can be trained with any PyTorch training codes including [ImageNet training in PyTorch](https://github.com/pytorch/examples/tree/master/imagenet) with the model file and proper arguments. Since the provided model file is not complicated, we simply convert the model to train a ReXNet in other frameworks like MXNet. For MXNet, we recommend [MXnet-gluoncv](https://gluon-cv.mxnet.io/model_zoo/classification.html) as a training code. Using PyTorch, we trained ReXNets with one of the popular imagenet classification code, rwightman's [pytorch-image-models](https://github.com/rwightman/pytorch-image-models) for more efficient training. After including ReXNet's model file into the training code, one can train ReXNet-1.0x with the following command line: ./distributed_train.sh 4 /imagenet/ --model rexnetv1 --rex-width-mult 1.0 --opt sgd --amp \ --lr 0.5 --weight-decay 1e-5 \ --batch-size 128 --epochs 400 --sched cosine \ --remode pixel --reprob 0.2 --drop 0.2 --aa rand-m9-mstd0.5 ## License This project is distributed under [MIT license](LICENSE). ## How to cite ``` @article{han2020rexnet, title={{ReXNet}: Diminishing Representational Bottleneck on Convolutional Neural Network }, author={Han, Dongyoon and Yun, Sangdoo and Heo, Byeongho and Yoo, YoungJoon}, year={2020}, journal={arXiv preprint arXiv:2007.00992}, } ```