Practical implementation of an astoundingly simple method for self-supervised learning that achieves a new state of the art (surpassing SimCLR) without contrastive learning and without having to designate negative pairs.
This repository offers a module with which one can easily wrap any image-based neural network (residual network, discriminator, policy network) to immediately start benefitting from unlabelled image data.
$ pip install byol-pytorch
Simply plug in your neural network, specifying (1) the image dimensions and (2) the name (or index) of the hidden layer whose output is used as the latent representation for self-supervised training.
import torch
from byol_pytorch import BYOL
from torchvision import models

resnet = models.resnet50(pretrained=True)

learner = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = 'avgpool'
)

opt = torch.optim.Adam(learner.parameters(), lr=3e-4)

def sample_unlabelled_images():
    return torch.randn(20, 3, 256, 256)

for _ in range(100):
    images = sample_unlabelled_images()
    loss = learner(images)
    opt.zero_grad()
    loss.backward()
    opt.step()
    learner.update_moving_average() # update moving average of target encoder

# save your improved network
torch.save(resnet.state_dict(), './improved-net.pt')
That's pretty much it. After much training, the residual network should now perform better on its downstream supervised tasks.
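For example, one simple way to use the improved weights downstream is to load them into a fresh backbone, swap the classification head for your task, and fine-tune (or train only the new head) on labelled data. The snippet below is only an illustrative sketch; the 10-class head and the choice to fine-tune are placeholder assumptions, not part of this library.

import torch
from torchvision import models

# load the BYOL-pretrained weights back into a fresh backbone (illustrative sketch)
resnet = models.resnet50()
resnet.load_state_dict(torch.load('./improved-net.pt'))

# replace the classification head for a hypothetical 10-class downstream task
resnet.fc = torch.nn.Linear(resnet.fc.in_features, 10)

# then fine-tune as usual, or freeze the backbone and train only resnet.fc, on labelled data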
The hyperparameters have already been set to what the paper found optimal, but you can change them with extra keyword arguments to the base wrapper class.
learner = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = 'avgpool',
    projection_size = 256,           # the projection size
    projection_hidden_size = 4096,   # hidden dimension of the MLP for both the projection and prediction heads
    moving_average_decay = 0.99      # moving average decay factor for the target encoder, already set to what the paper recommends
)
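To make moving_average_decay concrete: in BYOL the target encoder is not trained by gradient descent but is updated as an exponential moving average of the online encoder's weights, and the decay is the EMA coefficient. The sketch below only illustrates that update rule; it is not the library's internal code, and calling learner.update_moving_average() handles this for you.

import torch

# Illustrative EMA update, where tau plays the role of moving_average_decay.
# Not the library's actual implementation.
@torch.no_grad()
def ema_update(target_encoder, online_encoder, tau = 0.99):
    for t_param, o_param in zip(target_encoder.parameters(), online_encoder.parameters()):
        # new_target = tau * old_target + (1 - tau) * online
        t_param.data.mul_(tau).add_(o_param.data, alpha = 1 - tau)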
By default, this library will use the augmentations from the SimCLR paper (which are also used in the BYOL paper). However, if you would like to specify your own augmentation pipeline, you can simply pass in your own custom augmentation function with the augment_fn keyword.

Augmentations must work in tensor space, so the kornia library is highly recommended for this. If you decide to use torchvision augmentations instead, make sure each tensor is first converted to a PIL image (.ToPILImage()) and then back to a tensor (.ToTensor()); a sketch of that approach follows the kornia example below.
import kornia
from torch import nn

augment_fn = nn.Sequential(
    kornia.augmentation.RandomHorizontalFlip()
)

learner = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = -2,
    augment_fn = augment_fn
)
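If you prefer torchvision's PIL-based transforms, here is a minimal sketch of the conversion described above: each image in the batch is converted to PIL, augmented, and stacked back into a tensor. The particular transforms and the torchvision_augment_fn name are assumptions for illustration, and this assumes augment_fn may be any callable mapping a batch tensor to an augmented batch tensor of the same shape.

import torch
from torchvision import transforms as T

# hypothetical PIL-based pipeline for illustration; choose whatever transforms you need
pil_augment = T.Compose([
    T.ToPILImage(),
    T.RandomHorizontalFlip(),
    T.ColorJitter(0.4, 0.4, 0.4, 0.1),
    T.ToTensor()
])

def torchvision_augment_fn(images):
    # images: (batch, channels, height, width) tensor with values in [0, 1]
    return torch.stack([pil_augment(image) for image in images])

learner = BYOL(
    resnet,
    image_size = 256,
    hidden_layer = -2,
    augment_fn = torchvision_augment_fn
)

In practice kornia remains preferable, since it operates on batched tensors directly (and on the GPU) rather than converting images one by one through PIL.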
@misc{grill2020bootstrap,
    title         = {Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning},
    author        = {Jean-Bastien Grill and Florian Strub and Florent Altché and Corentin Tallec and Pierre H. Richemond and Elena Buchatskaya and Carl Doersch and Bernardo Avila Pires and Zhaohan Daniel Guo and Mohammad Gheshlaghi Azar and Bilal Piot and Koray Kavukcuoglu and Rémi Munos and Michal Valko},
    year          = {2020},
    eprint        = {2006.07733},
    archivePrefix = {arXiv},
    primaryClass  = {cs.LG}
}