# mae_mindspore
**Repository Path**: Lin-Bert/mae_mindspore
## Basic Information
- **Project Name**: mae_mindspore
- **Description**: MAE Vit Model For Mindspore
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 3
- **Forks**: 2
- **Created**: 2022-02-11
- **Last Updated**: 2025-04-30
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## Masked Autoencoders: A MindSpore Implementation
This is a MindSpore/NPU re-implementation of the paper [Masked Autoencoders Are Scalable Vision Learners](https://arxiv.org/abs/2111.06377):
```
@Article{MaskedAutoencoders2021,
author = {Kaiming He and Xinlei Chen and Saining Xie and Yanghao Li and Piotr Doll{\'a}r and Ross Girshick},
journal = {arXiv:2111.06377},
title = {Masked Autoencoders Are Scalable Vision Learners},
year = {2021},
}
```
* The original implementation was in PyTorch+GPU. This re-implementation is in MindSpore/NPU.
* This repo is a modification of the original [mae](https://github.com/facebookresearch/mae) repo. Installation and data preparation follow that repo.
* This repo is based on [`mindspore==1.6.0`](https://www.mindspore.cn/install).
### Catalog
- [x] Pre-trained checkpoints + fine-tuning code
- [x] Pre-training code
### Pre-training
```shell
# Generate the RANK_TABLE_FILE required for distributed training
python hccl_tools.py --device_num [0,8]
```
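The generated RANK_TABLE_FILE describes the NPUs taking part in training, and each launched process reads its own rank from it. Below is a minimal sketch of the generic MindSpore distributed-initialization pattern on Ascend, assuming the launch script exports `DEVICE_ID` and the rank-table path for every process; this is illustrative only, the actual setup lives in this repo's pretrain/finetune scripts.
```python
import os
from mindspore import context
from mindspore.communication.management import init, get_rank, get_group_size

# Bind this process to its NPU; DEVICE_ID is assumed to be exported by the launch script.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend",
                    device_id=int(os.getenv("DEVICE_ID", "0")))

# init() sets up HCCL using the rank table referenced by the environment
# (e.g. RANK_TABLE_FILE); afterwards the rank and world size are available.
init()
print("rank:", get_rank(), "of", get_group_size())

# Data-parallel training with gradient averaging across devices.
context.set_auto_parallel_context(parallel_mode=context.ParallelMode.DATA_PARALLEL,
                                  gradients_mean=True,
                                  device_num=get_group_size())
```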
```shell
# start_lr = base_lr (1e-3) * device_num * batch_size / 256; see the paper or the official open-source code for hyper-parameter details
# Single-device training
python pretrain.py --config [CONFIG_PATH] --use_parallel False > train.log 2>&1 &
# Distributed training
cd scripts;
sh pretrain_dist.sh [RANK_TABLE_FILE] [CONFIG_PATH]
```
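The comment at the top of the block encodes the linear learning-rate scaling rule from the paper: the starting learning rate grows with the effective (global) batch size. A small sketch of the arithmetic with illustrative values; the actual values come from the config file passed via `--config`.
```python
# Linear LR scaling rule for pre-training (base_lr = 1e-3 per 256 images).
base_lr = 1e-3     # base learning rate at a reference batch size of 256
device_num = 8     # number of NPUs (illustrative value)
batch_size = 64    # per-device batch size (illustrative value)

start_lr = base_lr * device_num * batch_size / 256
print(start_lr)    # 0.002 for the values above
```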
### Fine-tuning with pre-trained checkpoints
```shell
# start_lr = base_lr (5e-4) * device_num * batch_size / 256; see the paper or the official open-source code for hyper-parameter details
# Single-device training
python finetune.py --config [CONFIG_PATH] --use_parallel False > finetune.log 2>&1 &
# Distributed training
cd scripts;
sh finetune_dist.sh [RANK_TABLE_FILE] [CONFIG_PATH]
```
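Fine-tuning initializes the encoder from a pre-training checkpoint. Below is a hedged sketch of the standard MindSpore way to load one; the network class and checkpoint file name are placeholders for illustration, not names defined by this repo.
```python
import mindspore.nn as nn
from mindspore import load_checkpoint, load_param_into_net

# A tiny stand-in network; in this repo the fine-tuning model is the ViT
# encoder plus a classification head built from the --config file.
class TinyNet(nn.Cell):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.head = nn.Dense(768, num_classes)

    def construct(self, x):
        return self.head(x)

net = TinyNet()

# "mae_pretrain_vit_b.ckpt" is a placeholder file name, not one shipped here.
param_dict = load_checkpoint("mae_pretrain_vit_b.ckpt")

# Parameters whose names and shapes match are copied into the network;
# the return value lists the parameters that were not loaded.
not_loaded = load_param_into_net(net, param_dict)
print(not_loaded)
```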
### Eval
```shell
# Distributed evaluation
cd scripts;
sh eval_dist.sh [RANK_TABLE_FILE] [CONFIG_PATH]
```
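The acc@top1 figure reported below is plain top-1 classification accuracy. A self-contained sketch of how such a metric can be computed with MindSpore's built-in metric class, using toy data rather than this repo's eval outputs:
```python
import numpy as np
from mindspore import Tensor
import mindspore.nn as nn

# Toy logits for 4 samples over 3 classes, and their ground-truth labels.
logits = Tensor(np.array([[0.1, 0.8, 0.1],
                          [0.7, 0.2, 0.1],
                          [0.2, 0.2, 0.6],
                          [0.9, 0.05, 0.05]], dtype=np.float32))
labels = Tensor(np.array([1, 0, 2, 1], dtype=np.int32))

metric = nn.Top1CategoricalAccuracy()
metric.clear()
metric.update(logits, labels)
print(metric.eval())  # 0.75: three of the four predictions are correct
```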
By fine-tuning the pre-trained model, we obtain the following classification result (see the paper for the full benchmark):
* Note: pre-trained on the ImageNet-1K dataset; top-1 accuracy after fine-tuning is 0.801.

| Model | Data | Top-1 Acc. (%) |
| --- | --- | --- |
| ViT-B | ImageNet-1K (no external data) | 80.1 |
### License
This project is under the CC-BY-NC 4.0 license. See [LICENSE](LICENSE) for details.