English | 简体中文
v0.9.0 was released on 2023-10-10.
Highlights:
Support training with ColossalAI. Refer to the Training Large Models for more detailed usages.
Support gradient checkpointing. Refer to the Save Memory on GPU for more details.
Supports multiple visualization backends, including NeptuneVisBackend
, DVCLiveVisBackend
and AimVisBackend
. Refer to Visualization Backends for more details.
Read Changelog for more details.
MMEngine is a foundational library for training deep learning models based on PyTorch. It provides a solid engineering foundation and frees developers from writing redundant codes on workflows. It serves as the training engine of all OpenMMLab codebases, which support hundreds of algorithms in various research areas. Moreover, MMEngine is also generic to be applied to non-OpenMMLab projects.
Major features:
A universal and powerful runner:
Open architecture with unified interfaces:
Customizable training process:
Before installing MMEngine, please ensure that PyTorch has been successfully installed following the official guide.
Install MMEngine
pip install -U openmim
mim install mmengine
Verify the installation
python -c 'from mmengine.utils.dl_utils import collect_env;print(collect_env())'
Taking the training of a ResNet-50 model on the CIFAR-10 dataset as an example, we will use MMEngine to build a complete, configurable training and validation process in less than 80 lines of code.
First, we need to define a model which 1) inherits from BaseModel
and 2) accepts an additional argument mode
in the forward
method, in addition to those arguments related to the dataset.
mode
is "loss", and the forward
method should return a dict
containing the key "loss".mode
is "predict", and the forward method should return results containing both predictions and labels.import torch.nn.functional as F
import torchvision
from mmengine.model import BaseModel
class MMResNet50(BaseModel):
def __init__(self):
super().__init__()
self.resnet = torchvision.models.resnet50()
def forward(self, imgs, labels, mode):
x = self.resnet(imgs)
if mode == 'loss':
return {'loss': F.cross_entropy(x, labels)}
elif mode == 'predict':
return x, labels
Next, we need to create Datasets and DataLoaders for training and validation. In this case, we simply use built-in datasets supported in TorchVision.
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
norm_cfg = dict(mean=[0.491, 0.482, 0.447], std=[0.202, 0.199, 0.201])
train_dataloader = DataLoader(batch_size=32,
shuffle=True,
dataset=torchvision.datasets.CIFAR10(
'data/cifar10',
train=True,
download=True,
transform=transforms.Compose([
transforms.RandomCrop(32, padding=4),
transforms.RandomHorizontalFlip(),
transforms.ToTensor(),
transforms.Normalize(**norm_cfg)
])))
val_dataloader = DataLoader(batch_size=32,
shuffle=False,
dataset=torchvision.datasets.CIFAR10(
'data/cifar10',
train=False,
download=True,
transform=transforms.Compose([
transforms.ToTensor(),
transforms.Normalize(**norm_cfg)
])))
To validate and test the model, we need to define a Metric called accuracy to evaluate the model. This metric needs to inherit from BaseMetric
and implements the process
and compute_metrics
methods.
from mmengine.evaluator import BaseMetric
class Accuracy(BaseMetric):
def process(self, data_batch, data_samples):
score, gt = data_samples
# Save the results of a batch to `self.results`
self.results.append({
'batch_size': len(gt),
'correct': (score.argmax(dim=1) == gt).sum().cpu(),
})
def compute_metrics(self, results):
total_correct = sum(item['correct'] for item in results)
total_size = sum(item['batch_size'] for item in results)
# Returns a dictionary with the results of the evaluated metrics,
# where the key is the name of the metric
return dict(accuracy=100 * total_correct / total_size)
Finally, we can construct a Runner with previously defined Model
, DataLoader
, and Metrics
, with some other configs, as shown below.
from torch.optim import SGD
from mmengine.runner import Runner
runner = Runner(
model=MMResNet50(),
work_dir='./work_dir',
train_dataloader=train_dataloader,
# a wrapper to execute back propagation and gradient update, etc.
optim_wrapper=dict(optimizer=dict(type=SGD, lr=0.001, momentum=0.9)),
# set some training configs like epochs
train_cfg=dict(by_epoch=True, max_epochs=5, val_interval=1),
val_dataloader=val_dataloader,
val_cfg=dict(),
val_evaluator=dict(type=Accuracy),
)
runner.train()
We appreciate all contributions to improve MMEngine. Please refer to CONTRIBUTING.md for the contributing guideline.
If you find this project useful in your research, please consider cite:
@article{mmengine2022,
title = {{MMEngine}: OpenMMLab Foundational Library for Training Deep Learning Models},
author = {MMEngine Contributors},
howpublished = {\url{https://github.com/open-mmlab/mmengine}},
year={2022}
}
This project is released under the Apache 2.0 license.
此处可能存在不合适展示的内容,页面不予展示。您可通过相关编辑功能自查并修改。
如您确认内容无涉及 不当用语 / 纯广告导流 / 暴力 / 低俗色情 / 侵权 / 盗版 / 虚假 / 无价值内容或违法国家有关法律法规的内容,可点击提交进行申诉,我们将尽快为您处理。