There are two types of data pipelines in MMTracking: the pipeline of a single image and the pipeline of multiple images.
For a single image, you may refer to the tutorial in MMDetection.
There are several differences in MMTracking:
We implement `VideoCollect`, which is similar to `Collect` in MMDetection but is more compatible with video perception tasks. For example, the meta keys `frame_id` and `is_video_data` are collected by default.

In some cases, we may need to process multiple images simultaneously. This is basically because we need to sample reference images of the key image in the same video to facilitate the training or inference process.
Please first take a look at the case of a single image above, because the case of multiple images relies heavily on it. We explain the details of the pipeline below.
We sample and load the annotations of the reference images once we get the annotations of the key image.
Take `CocoVideoDataset` as an example: it provides a method `ref_img_sampling` to sample and load the annotations of the reference images.
```python
from mmdet.datasets import CocoDataset


class CocoVideoDataset(CocoDataset):

    def __init__(self,
                 ref_img_sampler=None,
                 *args,
                 **kwargs):
        super().__init__(*args, **kwargs)
        self.ref_img_sampler = ref_img_sampler

    def ref_img_sampling(self, img_info, **kwargs):
        """Sample and load the annotations of the reference images."""
        pass

    def prepare_data(self, idx):
        img_info = self.data_infos[idx]
        if self.ref_img_sampler is not None:
            img_infos = self.ref_img_sampling(img_info, **self.ref_img_sampler)
        ...
```
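`ref_img_sampler` itself is a config dict passed to the dataset. As an illustration only (the exact keys are defined by the dataset class in your MMTracking version, so treat the values below as an assumption to be checked), a common setup looks like:

```python
# Illustrative ref_img_sampler config; verify the supported keys against
# the dataset class you actually use.
ref_img_sampler = dict(
    num_ref_imgs=1,       # number of reference images per key image
    frame_range=10,       # sample references within +/-10 frames of the key image
    filter_key_img=True,  # do not sample the key image itself as a reference
    method='uniform')     # sampling strategy within the frame range
```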
In this case, the loaded annotations are no longer a `dict` but a `list[dict]` that contains the annotations of the key and reference images.
The first item of the list is the annotations of the key image.
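For intuition, with one sampled reference image the loaded list conceptually looks like the sketch below; the `img_info`/`ann_info` layout follows the usual MMDetection conventions, and the file names and field values are placeholders:

```python
# Conceptual sketch of the loaded annotations; the first item is the key image.
results = [
    dict(img_info=dict(filename='frame_000005.jpg', frame_id=5),
         ann_info=dict(bboxes=..., labels=..., instance_ids=...)),  # key image
    dict(img_info=dict(filename='frame_000008.jpg', frame_id=8),
         ann_info=dict(bboxes=..., labels=..., instance_ids=...)),  # reference image
]
```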
In this step, we apply the transformations and then collect the information of the images.
In contrast to the pipeline of a single image, which takes a dictionary as input and outputs a dictionary for the next transformation, the sequential pipelines take a list of dictionaries as input and output a list of dictionaries for the next transformation.
These sequential pipelines generally inherit from the pipelines in MMDetection but process the list in a loop.
```python
from mmdet.datasets.builder import PIPELINES
from mmdet.datasets.pipelines import LoadImageFromFile


@PIPELINES.register_module()
class LoadMultiImagesFromFile(LoadImageFromFile):
    """Load a list of images from their files."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)

    def __call__(self, results):
        outs = []
        # Apply the single-image transform to each dict in the list.
        for _results in results:
            _results = super().__call__(_results)
            outs.append(_results)
        return outs
```
Sometimes you may need to add a parameter `share_params` to decide whether to share the random seed of the transformation across these images.
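As a minimal sketch of this idea (not the actual MMTracking implementation; the class name and flip logic are made up for illustration), a sequential transform with `share_params` could sample its random decision once and reuse it for every image in the list:

```python
import numpy as np

from mmdet.datasets.builder import PIPELINES


@PIPELINES.register_module()
class SeqRandomFlipSketch:
    """Hypothetical flip transform for a list of images.

    If ``share_params`` is True, one flip decision is sampled and applied to
    the key image and all reference images; otherwise each image samples its
    own decision.
    """

    def __init__(self, share_params=True, flip_ratio=0.5):
        self.share_params = share_params
        self.flip_ratio = flip_ratio

    def __call__(self, results):
        if self.share_params:
            # One shared decision for the whole list of images.
            flips = [np.random.rand() < self.flip_ratio] * len(results)
        else:
            # An independent decision per image.
            flips = [np.random.rand() < self.flip_ratio for _ in results]

        outs = []
        for _results, flip in zip(results, flips):
            _results['flip'] = flip
            if flip:
                # Horizontal flip of the image array.
                _results['img'] = np.flip(_results['img'], axis=1)
            outs.append(_results)
        return outs
```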
If there is more than one reference image, we implement `ConcatVideoReferences` to collect the reference images into a single dictionary.
The length of the list is 2 after this process: the key image and the concatenated reference images.
In the end, we implement `SeqDefaultFormatBundle` to convert the list into a dictionary that serves as the input of the model forward.
Here is an example of the data pipeline:
```python
train_pipeline = [
    dict(type='LoadMultiImagesFromFile'),
    dict(type='SeqLoadAnnotations', with_bbox=True, with_track=True),
    dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True),
    dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
    dict(type='SeqNormalize', **img_norm_cfg),
    dict(type='SeqPad', size_divisor=16),
    dict(
        type='VideoCollect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']),
    dict(type='ConcatVideoReferences'),
    dict(type='SeqDefaultFormatBundle', ref_prefix='ref')
]
```
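The pipeline is then referenced from the dataset config together with `ref_img_sampler`. Below is a hedged sketch, with placeholder paths and sampler values:

```python
# Sketch of wiring the sequential pipeline into a dataset config;
# paths and sampler values are placeholders.
data = dict(
    train=dict(
        type='CocoVideoDataset',
        ann_file='annotations/train.json',
        img_prefix='train/',
        ref_img_sampler=dict(
            num_ref_imgs=1,
            frame_range=10,
            filter_key_img=True,
            method='uniform'),
        pipeline=train_pipeline))
```

Since `ref_prefix='ref'` is set in `SeqDefaultFormatBundle`, the formatted reference data are passed to the model under keys with the `ref_` prefix, e.g. `ref_img` and `ref_gt_bboxes`.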