# MindSpore Vision Transformer

**Repository Path**: tianyu__zhou/mindspore-vision-transformer

## Basic Information

- **Project Name**: MindSpore Vision Transformer
- **Description**: No description available
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 5
- **Forks**: 0
- **Created**: 2021-11-23
- **Last Updated**: 2022-07-25

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

#### Model Architecture

![Model architecture](pics/vit_arch.PNG)

**AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE**
https://arxiv.org/pdf/2010.11929.pdf

#### Environment

- MindSpore 1.3: https://www.mindspore.cn/tutorials/en/r1.3/index.html
- Training image (version 21.0.2-{ubuntu18.04, centos7.6}): https://ascendhub.huawei.com/#/detail/ascend-mindspore

#### Benchmark

| Dataset | No. NPUs | Batch Size | Embedding Size | No. Encoders | No. Params | Training FPS |
|---|---|---|---|---|---|---|
| Cifar10 | 1 | 128 | 128 | 2 | 206,858 | 139 |
| Cifar10 | 1 | 128 | 768 | 12 | 42,605,578 | 9 |

#### Progress

- Model architecture
- Training data pre-processing
- Loss and optimizer definition
- Single-NPU training

#### TODO

- Learning rate scheduler
- Multi-NPU training
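
#### Appendix: Patch Embedding Sketch

The ViT paper cited above treats an image as a sequence of flattened patch tokens, which are then linearly projected to the embedding size (128 or 768 in the benchmark table). The repository's own implementation is not shown here; the following is a minimal NumPy sketch of that patch-embedding step, with an illustrative patch size of 4 for CIFAR-10's 32x32x3 images (the actual patch size used by this project is an assumption):

```python
import numpy as np

def patchify(images, patch_size=4):
    """Split a batch of images (N, H, W, C) into flattened patches.

    For CIFAR-10 (32x32x3) with patch_size=4 this yields 64 patches
    of length 4*4*3 = 48 per image.
    """
    n, h, w, c = images.shape
    p = patch_size
    assert h % p == 0 and w % p == 0, "image dims must be divisible by patch size"
    # (N, H/p, p, W/p, p, C) -> (N, H/p, W/p, p, p, C) -> (N, num_patches, p*p*C)
    x = images.reshape(n, h // p, p, w // p, p, c)
    x = x.transpose(0, 1, 3, 2, 4, 5)
    return x.reshape(n, (h // p) * (w // p), p * p * c)

# Linear projection of flattened patches to the embedding dimension.
# Weights are random here; in the model they are learned parameters.
rng = np.random.default_rng(0)
imgs = rng.random((2, 32, 32, 3)).astype(np.float32)
patches = patchify(imgs)                        # shape (2, 64, 48)
w_embed = rng.random((48, 128)).astype(np.float32)
tokens = patches @ w_embed                      # shape (2, 64, 128)
```

The resulting token sequence (plus a class token and position embeddings, as in the paper) is what the transformer encoders consume.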