# mae_mindspore
**Repository Path**: Lin-Bert/mae_mindspore
## Basic Information
- **Project Name**: mae_mindspore
- **Description**: MAE Vit Model For Mindspore
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 3
- **Forks**: 2
- **Created**: 2022-02-11
- **Last Updated**: 2025-04-30
## Categories & Tags
**Categories**: Uncategorized
**Tags**: None
## README
## Masked Autoencoders: A MindSpore Implementation
This is a MindSpore/NPU re-implementation of the paper [Masked Autoencoders Are Scalable Vision Learners](https://arxiv.org/abs/2111.06377):
```
@Article{MaskedAutoencoders2021,
author = {Kaiming He and Xinlei Chen and Saining Xie and Yanghao Li and Piotr Doll{\'a}r and Ross Girshick},
journal = {arXiv:2111.06377},
title = {Masked Autoencoders Are Scalable Vision Learners},
year = {2021},
}
```
* The original implementation was in PyTorch+GPU. This re-implementation is in MindSpore/NPU.
* This repo is a modification of the original [mae](https://github.com/facebookresearch/mae) repo. Installation and data preparation follow that repo.
* This repo is based on [`mindspore==1.6.0`](https://www.mindspore.cn/install).
### Catalog
- [x] Pre-trained checkpoints + fine-tuning code
- [x] Pre-training code
### Pre-training
```shell
# Generate the RANK_TABLE_FILE required for distributed training
python hccl_tools.py --device_num [0,8]
```
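The generated RANK_TABLE_FILE describes the NPUs taking part in training, and each launched process reads its own rank from it. Below is a minimal sketch of the generic MindSpore distributed-initialization pattern on Ascend, assuming the launch script exports `DEVICE_ID` and the rank-table path for every process; this is illustrative only, the actual setup lives in this repo's pretrain/finetune scripts.
```python
import os
from mindspore import context
from mindspore.communication.management import init, get_rank, get_group_size

# Bind this process to its NPU; DEVICE_ID is assumed to be exported by the launch script.
context.set_context(mode=context.GRAPH_MODE, device_target="Ascend",
                    device_id=int(os.getenv("DEVICE_ID", "0")))

# init() sets up HCCL using the rank table referenced by the environment
# (e.g. RANK_TABLE_FILE); afterwards the rank and world size are available.
init()
print("rank:", get_rank(), "of", get_group_size())

# Data-parallel training with gradient averaging across devices.
context.set_auto_parallel_context(parallel_mode=context.ParallelMode.DATA_PARALLEL,
                                  gradients_mean=True,
                                  device_num=get_group_size())
```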
```shell
# start_lr = base_lr (1e-3) * device_num * batch_size / 256; see the paper or the official open-source code for hyper-parameter details
# Single-device training
python pretrain.py --config [CONFIG_PATH] --use_parallel False > train.log 2>&1 &
# Distributed training
cd scripts;
sh pretrain_dist.sh [RANK_TABLE_FILE] [CONFIG_PATH]
```
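The comment at the top of the block encodes the linear learning-rate scaling rule from the paper: the starting learning rate grows with the effective (global) batch size. A small sketch of the arithmetic with illustrative values; the actual values come from the config file passed via `--config`.
```python
# Linear LR scaling rule for pre-training (base_lr = 1e-3 per 256 images).
base_lr = 1e-3     # base learning rate at a reference batch size of 256
device_num = 8     # number of NPUs (illustrative value)
batch_size = 64    # per-device batch size (illustrative value)

start_lr = base_lr * device_num * batch_size / 256
print(start_lr)    # 0.002 for the values above
```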
### Fine-tuning with pre-trained checkpoints
```shell
# start_lr = base_lr (5e-4) * device_num * batch_size / 256; see the paper or the official open-source code for hyper-parameter details
# Single-device training
python finetune.py --config [CONFIG_PATH] --use_parallel False > finetune.log 2>&1 &
# Distributed training
cd scripts;
sh finetune_dist.sh [RANK_TABLE_FILE] [CONFIG_PATH]
```
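Fine-tuning initializes the encoder from a pre-training checkpoint. Below is a hedged sketch of the standard MindSpore way to load one; the network class and checkpoint file name are placeholders for illustration, not names defined by this repo.
```python
import mindspore.nn as nn
from mindspore import load_checkpoint, load_param_into_net

# A tiny stand-in network; in this repo the fine-tuning model is the ViT
# encoder plus a classification head built from the --config file.
class TinyNet(nn.Cell):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.head = nn.Dense(768, num_classes)

    def construct(self, x):
        return self.head(x)

net = TinyNet()

# "mae_pretrain_vit_b.ckpt" is a placeholder file name, not one shipped here.
param_dict = load_checkpoint("mae_pretrain_vit_b.ckpt")

# Parameters whose names and shapes match are copied into the network;
# the return value lists the parameters that were not loaded.
not_loaded = load_param_into_net(net, param_dict)
print(not_loaded)
```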
### Eval
```shell
# Distributed evaluation
cd scripts;
sh eval_dist.sh [RANK_TABLE_FILE] [CONFIG_PATH]
```
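The acc@top1 figure reported below is plain top-1 classification accuracy. A self-contained sketch of how such a metric can be computed with MindSpore's built-in metric class, using toy data rather than this repo's eval outputs:
```python
import numpy as np
from mindspore import Tensor
import mindspore.nn as nn

# Toy logits for 4 samples over 3 classes, and their ground-truth labels.
logits = Tensor(np.array([[0.1, 0.8, 0.1],
                          [0.7, 0.2, 0.1],
                          [0.2, 0.2, 0.6],
                          [0.9, 0.05, 0.05]], dtype=np.float32))
labels = Tensor(np.array([1, 0, 2, 1], dtype=np.int32))

metric = nn.Top1CategoricalAccuracy()
metric.clear()
metric.update(logits, labels)
print(metric.eval())  # 0.75: three of the four predictions are correct
```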
By fine-tuning the pre-trained model, we obtain the following classification result (see the paper for the full benchmark):
* Note: pre-trained on the ImageNet-1K dataset; top-1 accuracy after fine-tuning is 0.801.

| Model | Data | Top-1 Acc. (%) |
| --- | --- | --- |
| ViT-B | ImageNet-1K (no external data) | 80.1 |
### License
This project is under the CC-BY-NC 4.0 license. See [LICENSE](LICENSE) for details.