This paper studies the problem of designing compact binary architectures for vision multi-layer perceptrons (MLPs). We provide extensive analysis on the difficulty of binarizing vision MLPs and find that previous binarization methods perform poorly due to the limited capacity of binary MLPs. In contrast with traditional CNNs that utilize convolutional operations with large kernel sizes, the fully-connected (FC) layers in MLPs can be treated as convolutional layers with a kernel size of 1×1. Thus, the representation ability of the FC layers is limited when they are binarized, which restricts the capability of spatial mixing and channel mixing on the intermediate features. To this end, we propose to improve the performance of the binary MLP (BiMLP) model by enriching the representation ability of binary FC layers. We design a novel binary block that contains multiple branches to merge a series of outputs from the same stage, and also a universal shortcut connection that encourages the information flow from the previous stage. The downsampling layers are also carefully designed to reduce the computational complexity while maintaining the classification performance. Experimental results on the benchmark dataset ImageNet-1k demonstrate the effectiveness of the proposed BiMLP models, which achieve state-of-the-art accuracy compared to prior binary CNNs.
Paper: Yixing Xu, Xinghao Chen, Yunhe Wang. BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons. NeurIPS 2022.
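The abstract's two key observations — an FC layer applied per pixel is equivalent to a 1×1 convolution, and binarization collapses its weights to {-1, +1} — can be sketched numerically. This is a minimal NumPy illustration with an assumed XNOR-style scaling factor, not the authors' actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))    # 4 tokens (pixels), 8 channels each
w = rng.standard_normal((8, 16))   # FC weight mapping 8 -> 16 channels

# Per-token FC layer: this matrix product is exactly a 1x1 convolution
# applied independently at every spatial location.
fc_out = x @ w

# Binarization: weights collapse to {-1, +1}, drastically limiting the
# representation ability of the layer.
w_bin = np.sign(w)
alpha = np.abs(w).mean()           # assumed XNOR-Net-style scaling factor
bin_out = (x @ w_bin) * alpha      # binarized FC output
```

Because every binary weight carries only one bit, the binarized 1×1 mixing layers lose much of their expressiveness, which motivates the multi-branch blocks and shortcut connections of BiMLP.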
An illustration of the BiMLP architecture:
Dataset used: [ImageNet2012]
├── BiMLP
│   ├── Readme.md             # descriptions about BiMLP
│   ├── src
│   │   ├── quan_conv.py      # quantized (binary) convolution layers
│   │   ├── dataset.py        # creating dataset
│   │   ├── wavemlp_20_3.py   # BiMLP model architecture
│   ├── eval.py               # evaluation script
After installing MindSpore via the official website, you can start evaluation as follows:
# infer example (GPU)
python eval.py --dataset_path dataset --platform GPU --checkpoint_path [CHECKPOINT_PATH] --checkpoint_nm BiMLP_M
The checkpoint can be produced during the training process.
result: {'acc': 0.7155689820742638} ckpt= ./BiMLP_M.ckpt
| Parameters | GPU |
| --- | --- |
| Model Version | BiMLP_M |
| Resource | GPU |
| Uploaded Date | 11/26/2022 (month/day/year) |
| MindSpore Version | 1.8.1 |
| Dataset | ImageNet2012 |
| batch_size | 64 |
| outputs | probability |
| Accuracy | 1pc: 71.56% |
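Since the model outputs class probabilities, the reported accuracy (`'acc'` in the evaluation result dict) is top-1 accuracy over the validation set. A hypothetical sketch of that computation (function and variable names are assumptions, not the repository's code):

```python
import numpy as np

def top1_accuracy(probs: np.ndarray, labels: np.ndarray) -> float:
    """Fraction of samples whose highest-probability class matches the label."""
    preds = probs.argmax(axis=1)          # predicted class per sample
    return float((preds == labels).mean())

# Toy example: first sample predicted correctly, second incorrectly.
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1]])
labels = np.array([1, 2])
acc = top1_accuracy(probs, labels)        # 0.5
```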
In dataset.py, we set the seed inside the "create_dataset" function. We also use a random seed in train.py.
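For reproducibility, seeding typically covers every random-number source touched by data loading and augmentation. A minimal sketch of such a helper (hypothetical, not the repository's exact code; MindSpore additionally provides `mindspore.set_seed` for framework-level determinism):

```python
import random
import numpy as np

def set_seed(seed: int = 1) -> None:
    """Fix the RNGs used by shuffling and augmentation (hypothetical helper)."""
    random.seed(seed)      # Python's built-in RNG
    np.random.seed(seed)   # NumPy RNG used by many augmentation ops
    # mindspore.set_seed(seed) would also be called in the real pipeline.

set_seed(1)
first = random.random()
set_seed(1)
second = random.random()   # identical to `first` after reseeding
```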
For more details, please check the official homepage.