# Pytorch-HarDNet
**Repository Path**: mabeisi/Pytorch-HarDNet
## Basic Information
- **Project Name**: Pytorch-HarDNet
- **Description**: 35% faster than ResNet: Harmonic DenseNet, A low memory traffic network
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2021-07-07
- **Last Updated**: 2021-07-07
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
# Pytorch-HarDNet
### [Harmonic DenseNet: A low memory traffic network (ICCV 2019 paper)](https://arxiv.org/abs/1909.00948)
### See also [CenterNet-HarDNet](https://github.com/PingoLH/CenterNet-HarDNet) for Object Detection in 44.3 mAP / 45 fps on COCO dataset
### and [FC-HarDNet](https://github.com/PingoLH/FCHarDNet) for Semantic Segmentation
* Fully utilize your cuda cores!
* Unlike CNN models that rely on many Conv1x1 layers to reduce model size and MAC count, HarDNet mainly uses Conv3x3 (with only one Conv1x1 layer per HarDNet block) to increase computational density.
* The increased computational density shifts a model from Memory-Bound to Compute-Bound.
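As a rough back-of-envelope illustration of this point (the tensor sizes below are arbitrary, not taken from the paper), compare the MACs performed per unit of feature-map traffic for a Conv3x3 versus a Conv1x1 with the same channel count:

```python
# Back-of-envelope computational density for one conv layer with
# C input and C output channels on an H x W feature map (sizes arbitrary).
C, H, W = 256, 56, 56

macs_3x3 = 9 * C * C * H * W  # Conv3x3 multiply-accumulate ops
macs_1x1 = 1 * C * C * H * W  # Conv1x1 multiply-accumulate ops
traffic = 2 * C * H * W       # feature-map read + write, same for both

density_3x3 = macs_3x3 / traffic
density_1x1 = macs_1x1 / traffic
print(density_3x3 / density_1x1)  # 9.0: Conv3x3 does 9x the work per element moved
```

With equal memory traffic, the Conv3x3 layer keeps the CUDA cores busy nine times longer per byte moved, which is the sense in which it is more "compute-dense".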
## Architecture
#### HarDNet Block:
- k = growth rate (as in DenseNet)
- m = channel weighting factor (1.6~1.7)
- Conv3x3 for all layers (no bottleneck layer)
- Conv-BN-ReLU for all layers instead of the BN-ReLU-Conv ordering used in DenseNet
- See [MIPT-Oulu/pytorch_bn_fusion](https://github.com/MIPT-Oulu/pytorch_bn_fusion) to get rid of BatchNorm for inference.
- No global dense connection (input of a HarDBlk is NOT reused as a part of output)
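The harmonic link/width rule behind the block can be sketched in a few lines: layer *l* takes input from layers *l* − 2^i for every power of two that divides *l*, and its width is the growth rate k scaled by m once per extra power of two. The helper below is an illustrative sketch (the name `get_link` and the round-to-even detail are modeled on the repo's `hardnet.py`, but treat it as a sketch rather than the canonical implementation):

```python
def get_link(layer, base_ch, growth_rate, grmul):
    """Sketch of the harmonic connection pattern.

    Returns (out_channels, in_channels, links) for the given layer index,
    where growth_rate = k and grmul = m from the list above.
    """
    if layer == 0:  # layer 0 is the block input
        return base_ch, 0, []
    out_channels = growth_rate
    links = []
    for i in range(10):  # enough powers of two for any practical block depth
        dv = 2 ** i
        if layer % dv == 0:
            links.append(layer - dv)   # link back to layer l - 2^i
            if i > 0:
                out_channels *= grmul  # widen by m per extra power of two
    out_channels = int(int(out_channels + 1) / 2) * 2  # round to an even width
    in_channels = sum(get_link(l, base_ch, growth_rate, grmul)[0] for l in links)
    return out_channels, in_channels, links
```

For example, with k = 24, m = 1.7, and a 64-channel block input, layer 2 links back to layers 1 and 0, receives 24 + 64 = 88 input channels, and outputs 40 channels.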
#### HarDNet68/85:
- Enhanced local feature extraction to benefit the detection of small objects
- A transitional Conv1x1 layer is employed after each HarDNet block (HarDBlk)
## Results
| Method | MParam | GMACs | Inference Time* | ImageNet Top-1 | COCO mAP with SSD512 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| **HarDNet68** | 17.6 | 4.3 | 22.5 ms | 76.5 | 31.7 |
| ResNet-50 | 25.6 | 4.1 | 31.0 ms | 76.2 | - |
| **HarDNet85** | 36.7 | 9.1 | 38.0 ms | 78.0 | 35.1 |
| ResNet-101 | 44.6 | 7.8 | 51.2 ms | 78.0 | 31.2 |
| VGG-16 | 138 | 15.5 | 49 ms | 73.4 | 28.8 |
\* Inference time measured on an NVIDIA GTX 1080 Ti with PyTorch 1.1.0,\
averaged over 300 iterations of random 1024x1024 input images.
## Results of Depthwise Separable (DS) version of HarDNet
| Method | MParam | GMACs | Inference Time** | ImageNet Top-1 |
| :---: | :---: | :---: | :---: | :---: |
| **HarDNet39DS** | 3.5 | 0.44 | 32.5 ms | 72.1 |
| MobileNetV2 | 3.5 | 0.3 | 37.9 ms | 72.0 |
| **HarDNet68DS** | 4.2 | 0.8 | 52.6 ms | 74.3 |
| MobileNetV2 1.4x | 6.1 | 0.6 | 57.8 ms | 74.7 |
\** Inference time measured on an NVIDIA Jetson Nano with TensorRT,\
averaged over 500 iterations of random 320x320 input images.
## Train HarDNet models for ImageNet
The training procedure is branched from https://github.com/pytorch/examples/tree/master/imagenet
Training:
```
python main.py -a hardnet68 [imagenet-folder with train and val folders]
```
where the `-a` (architecture) argument is one of `hardnet39ds | hardnet68ds | hardnet68 | hardnet85`.
Evaluating:
```
python main.py -a hardnet68 --pretrained -e [imagenet-folder with train and val folders]
```
For HarDNet85, please download the pretrained weights from [here](https://drive.google.com/file/d/1I-qbZtpVlWbRyz1c3lT7rg2IqxCl28at/view?usp=sharing).
### Hyperparameters
- epochs 150 ~ 250
- initial lr = 0.05
- batch size = 256
- weight decay = 6e-5
- cosine learning rate decay
- nesterov = True
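The cosine decay above anneals the learning rate from 0.05 toward 0 over training. A minimal sketch of the schedule (the 250-epoch choice is taken from the upper end of the range above; in PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` on top of SGD with `nesterov=True`):

```python
import math

def cosine_lr(epoch, total_epochs=250, base_lr=0.05):
    # Cosine learning-rate decay: base_lr at epoch 0, ~0 at the last epoch.
    # total_epochs = 250 is an assumption from the 150~250 range above.
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))

print(cosine_lr(0))    # 0.05 at the start
print(cosine_lr(125))  # ~0.025 at the halfway point
```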