# moco

**Repository Path**: dabaier/moco

## Basic Information

- **Project Name**: moco
- **Description**: PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2020-11-01
- **Last Updated**: 2024-10-22

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

## MoCo: Momentum Contrast for Unsupervised Visual Representation Learning

This is a PyTorch implementation of the [MoCo paper](https://arxiv.org/abs/1911.05722):

```
@Article{he2019moco,
  author  = {Kaiming He and Haoqi Fan and Yuxin Wu and Saining Xie and Ross Girshick},
  title   = {Momentum Contrast for Unsupervised Visual Representation Learning},
  journal = {arXiv preprint arXiv:1911.05722},
  year    = {2019},
}
```

It also includes the implementation of the [MoCo v2 paper](https://arxiv.org/abs/2003.04297):

```
@Article{chen2020mocov2,
  author  = {Xinlei Chen and Haoqi Fan and Ross Girshick and Kaiming He},
  title   = {Improved Baselines with Momentum Contrastive Learning},
  journal = {arXiv preprint arXiv:2003.04297},
  year    = {2020},
}
```

### Preparation

Install PyTorch and prepare the ImageNet dataset following the [official PyTorch ImageNet training code](https://github.com/pytorch/examples/tree/master/imagenet).

This repo makes only minimal modifications to that code. Check the modifications by:

```
diff main_moco.py <(curl https://raw.githubusercontent.com/pytorch/examples/master/imagenet/main.py)
diff main_lincls.py <(curl https://raw.githubusercontent.com/pytorch/examples/master/imagenet/main.py)
```

### Unsupervised Training

This implementation only supports **multi-gpu**, **DistributedDataParallel** training, which is faster and simpler; single-gpu or DataParallel training is not supported.

To do unsupervised pre-training of a ResNet-50 model on ImageNet in an 8-gpu machine, run:

```
python main_moco.py \
  -a resnet50 \
  --lr 0.03 \
  --batch-size 256 \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]
```

This script uses all the default hyper-parameters as described in the MoCo v1 paper. To run MoCo v2, set `--mlp --moco-t 0.2 --aug-plus --cos`.

***Note***: for 4-gpu training, we recommend following the [linear lr scaling recipe](https://arxiv.org/abs/1706.02677): `--lr 0.015 --batch-size 128` with 4 gpus. We got similar results using this setting.
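The linear scaling recipe in the note above keeps the ratio of learning rate to total batch size fixed. The following is a minimal sketch of that arithmetic, assuming the 8-gpu settings (`--lr 0.03`, `--batch-size 256`) as the reference point. `scaled_lr` is a hypothetical helper for illustration and is not part of this repo.

```python
def scaled_lr(batch_size, base_lr=0.03, base_batch_size=256):
    """Scale the learning rate linearly with the total batch size.

    The defaults are the 8-gpu settings from the command above; treating
    them as the reference point is an assumption of this sketch.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return base_lr * batch_size / base_batch_size


# Print the flags for the two configurations mentioned in this README.
for gpus, batch_size in [(8, 256), (4, 128)]:
    print(f"{gpus} gpus: --lr {scaled_lr(batch_size):g} --batch-size {batch_size}")
```

For other batch sizes the same helper gives a starting value only; the README reports results for the 8-gpu and 4-gpu settings, not for arbitrary ones.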
### Linear Classification

With a pre-trained model, to train a supervised linear classifier on frozen features/weights in an 8-gpu machine, run:

```
python main_lincls.py \
  -a resnet50 \
  --lr 30.0 \
  --batch-size 256 \
  --pretrained [your checkpoint path]/checkpoint_0199.pth.tar \
  --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 \
  [your imagenet-folder with train and val folders]
```

Linear classification results on ImageNet using this repo with 8 NVIDIA V100 GPUs:

| | pre-train epochs | pre-train time | MoCo v1 top-1 acc. | MoCo v2 top-1 acc. |
|---|---|---|---|---|
| ResNet-50 | 200 | 53 hours | 60.8±0.2 | 67.5±0.1 |
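The mean±std entries in this table can be recomputed from the five per-trial accuracies that this README lists for each method. The sketch below uses the population standard deviation (`statistics.pstdev`), because that estimator reproduces the reported ±0.1 for MoCo v2, whereas the sample standard deviation would round to 0.2. Which estimator was actually used is an inference from the numbers, not something the README states.

```python
from statistics import mean, pstdev

# Per-trial top-1 accuracies as reported in this README.
trials = {
    "MoCo v1": [60.6, 60.6, 60.7, 60.9, 61.1],
    "MoCo v2": [67.7, 67.6, 67.4, 67.6, 67.3],
}


def summarize(values):
    """Return (mean, population std), each rounded to one decimal place."""
    return round(mean(values), 1), round(pstdev(values), 1)


for name, values in trials.items():
    m, s = summarize(values)
    print(f"{name}: {m}±{s}")
```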

Here we run 5 trials (of pre-training and linear classification) and report mean±std: the 5 results of MoCo v1 are {60.6, 60.6, 60.7, 60.9, 61.1}, and those of MoCo v2 are {67.7, 67.6, 67.4, 67.6, 67.3}.

### Models

Our pre-trained ResNet-50 models can be downloaded as follows:

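| | epochs | mlp | aug+ | cos | top-1 acc. | model | md5 |
|---|---|---|---|---|---|---|---|
| MoCo v1 | 200 | | | | 60.6 | download | `b251726a` |
| MoCo v2 | 200 | ✓ | ✓ | ✓ | 67.7 | download | `59fd9945` |
| MoCo v2 | 800 | ✓ | ✓ | ✓ | 71.1 | download | `a04e12f8` |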
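The `md5` column can be used to check a downloaded checkpoint. The sketch below assumes that the 8-character values in the table are the leading characters of the file's full MD5 hex digest; the README does not state this explicitly. `md5_hexdigest` and `matches_md5_prefix` are hypothetical helpers, not part of this repo.

```python
import hashlib


def md5_hexdigest(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large checkpoints are not loaded at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_md5_prefix(path, expected_prefix):
    """Return True if the file's MD5 digest starts with the given hex prefix.

    Assumption: the table lists a prefix of the digest, not the full value.
    """
    return md5_hexdigest(path).startswith(expected_prefix.lower())
```

For example, `matches_md5_prefix(checkpoint_path, "b251726a")` would check a downloaded MoCo v1 file, where `checkpoint_path` is wherever the file was saved locally.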

### Transferring to Object Detection

See [./detection](detection).

### License

This project is under the CC-BY-NC 4.0 license. See [LICENSE](LICENSE) for details.

### See Also

* [moco.tensorflow](https://github.com/ppwwyyxx/moco.tensorflow): A TensorFlow re-implementation.
* [Colab notebook](https://colab.research.google.com/github/facebookresearch/moco/blob/colab-notebook/colab/moco_cifar10_demo.ipynb): CIFAR demo on Colab GPU.