# hivit
**Repository Path**: diidid/hivit
## Basic Information
- **Project Name**: hivit
- **License**: MIT
- **Default Branch**: master
## Statistics
- **Created**: 2025-02-10
- **Last Updated**: 2025-02-10
## README
# HiViT (ICLR2023, notable-top-25%)
This is the official implementation of the paper [HiViT: A Simpler and More Efficient Design of Hierarchical Vision Transformer](https://arxiv.org/abs/2205.14949).
## Results
| Model | Pretraining data | ImageNet-1K (top-1 acc., %) | COCO Det (box AP) | ADE Seg (mIoU) |
| ----------- | ---------------- | :---------: | :------: | :-----: |
| MAE-base | ImageNet-1K | 83.6 | 51.2 | 48.1 |
| SimMIM-base | ImageNet-1K | 84.0 | 52.3 | 52.8 |
| HiViT-base | ImageNet-1K | 84.6 | 53.3 | 52.8 |
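To make the comparison easier to read, the gaps between HiViT-base and the two baselines can be computed directly from the table. The snippet below is only an illustrative sketch: the scores are copied from the rows above, while the names `RESULTS`, `BENCHMARKS`, and `gains` are hypothetical helpers that are not part of this repository.

```python
# Scores copied from the results table, in column order:
# (ImageNet-1K, COCO Det, ADE Seg).
RESULTS = {
    "MAE-base": (83.6, 51.2, 48.1),
    "SimMIM-base": (84.0, 52.3, 52.8),
    "HiViT-base": (84.6, 53.3, 52.8),
}
BENCHMARKS = ("ImageNet-1K", "COCO Det", "ADE Seg")


def gains(model, baseline, results=RESULTS):
    """Return the per-benchmark difference `model - baseline`, rounded to 0.1."""
    return {
        bench: round(m - b, 1)
        for bench, m, b in zip(BENCHMARKS, results[model], results[baseline])
    }


for baseline in ("MAE-base", "SimMIM-base"):
    print(baseline, gains("HiViT-base", baseline))
```

Read this way, the table shows HiViT-base ahead of MAE-base on all three benchmarks, and ahead of SimMIM-base on ImageNet-1K and COCO while tying on ADE.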
## Pre-trained Models
- [mae_hivit_base_1600ep.pth](https://drive.google.com/file/d/1VZQz4buhlepZ5akTcEvrA3a_nxsQZ8eQ/view?usp=share_link)
- [mae_hivit_base_1600ep_ft100ep.pth](https://drive.google.com/file/d/1TVfocCnoJj-SB7to6UQFvrB2205u2Q59/view?usp=share_link)
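Once a checkpoint has been downloaded, it can be inspected before being passed to the training or evaluation scripts. The following is a minimal sketch, not the repository's own loading code. It assumes the file is a standard PyTorch checkpoint, that the weights may be nested under a `"model"` key, and that parameter names may carry a `module.` prefix left by `DistributedDataParallel`. None of these is confirmed here, and `unwrap_state_dict` and `load_weights` are hypothetical helper names.

```python
from collections import OrderedDict


def unwrap_state_dict(checkpoint, key="model", prefix="module."):
    """Return a flat name -> value mapping from a loaded checkpoint dict."""
    # Assumption: weights may be nested under `key`; otherwise use the dict itself.
    state = checkpoint.get(key, checkpoint)
    cleaned = OrderedDict()
    for name, value in state.items():
        # Drop the prefix that DistributedDataParallel adds to parameter names.
        if name.startswith(prefix):
            name = name[len(prefix):]
        cleaned[name] = value
    return cleaned


def load_weights(path):
    """Load a checkpoint file on CPU and return its cleaned state dict."""
    import torch  # imported lazily so the helper above works without PyTorch

    checkpoint = torch.load(path, map_location="cpu")
    return unwrap_state_dict(checkpoint)


# Demonstration on a plain dict; the parameter name is made up for illustration.
demo = {"model": {"module.patch_embed.proj.weight": 0}}
print(list(unwrap_state_dict(demo)))
```

With a downloaded file, `load_weights("mae_hivit_base_1600ep.pth")` would return the mapping to pass to a model's `load_state_dict`. Recent PyTorch releases default `torch.load` to `weights_only=True`, which may reject checkpoints that store extra pickled objects, so the call may need adjusting for files from a trusted source.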
## Usage
1. **Supervised learning on ImageNet-1K**: See [supervised/get_started.md](supervised/get_started.md) for a quick start.
2. **Self-supervised learning on ImageNet-1K**: See [self_supervised/get_started.md](self_supervised/get_started.md).
3. **Object detection**: See [detection/get_started.md](detection/get_started.md).
4. **Semantic segmentation**: See [segmentation/get_started.md](segmentation/get_started.md).
## BibTeX
Please consider citing our paper in your publications if the project helps your research.
```bibtex
@inproceedings{zhanghivit,
title={HiViT: A Simpler and More Efficient Design of Hierarchical Vision Transformer},
author={Zhang, Xiaosong and Tian, Yunjie and Xie, Lingxi and Huang, Wei and Dai, Qi and Ye, Qixiang and Tian, Qi},
booktitle={International Conference on Learning Representations},
year={2023},
}
```