@inproceedings{feichtenhofer2019slowfast,
title={Slowfast networks for video recognition},
author={Feichtenhofer, Christoph and Fan, Haoqi and Malik, Jitendra and He, Kaiming},
booktitle={Proceedings of the IEEE international conference on computer vision},
pages={6202--6211},
year={2019}
}
config | resolution | gpus | backbone | pretrain | top1 acc | top5 acc | inference time (video/s) | gpu mem (MB) | ckpt | log | json |
---|---|---|---|---|---|---|---|---|---|---|---|
slowonly_r50_4x16x1_256e_kinetics400_rgb | short-side 256 | 8x4 | ResNet50 | None | 72.76 | 90.51 | x | 3168 | ckpt | log | json |
slowonly_r50_video_4x16x1_256e_kinetics400_rgb | short-side 256 | 8 | ResNet50 | None | 72.11 | 90.32 | x | 3168 | ckpt | log | json |
slowonly_r50_8x8x1_256e_kinetics400_rgb | short-side 256 | 8x4 | ResNet50 | None | 74.42 | 91.49 | x | 5820 | ckpt | log | json |
slowonly_r50_4x16x1_256e_kinetics400_rgb | short-side 320 | 8x2 | ResNet50 | None | 73.02 | 90.77 | 4.0 (40x3 frames) | 3168 | ckpt | log | json |
slowonly_r50_8x8x1_256e_kinetics400_rgb | short-side 320 | 8x3 | ResNet50 | None | 74.93 | 91.92 | 2.3 (80x3 frames) | 5820 | ckpt | log | json |
slowonly_r50_4x16x1_256e_kinetics400_flow | short-side 320 | 8x2 | ResNet50 | ImageNet | 61.79 | 83.62 | x | 8450 | ckpt | log | json |
slowonly_r50_8x8x1_196e_kinetics400_flow | short-side 320 | 8x4 | ResNet50 | ImageNet | 65.76 | 86.25 | x | 8455 | ckpt | log | json |
In the data benchmark, we compare three data preprocessing methods: (1) resize the video to 340x256, (2) resize the short edge of the video to 320px, (3) resize the short edge of the video to 256px.
config | resolution | gpus | backbone | Input | pretrain | top1 acc | top5 acc | testing protocol | ckpt | log | json |
---|---|---|---|---|---|---|---|---|---|---|---|
slowonly_r50_randomresizedcrop_340x256_4x16x1_256e_kinetics400_rgb | 340x256 | 8x2 | ResNet50 | 4x16 | None | 71.61 | 90.05 | 10 clips x 3 crops | ckpt | log | json |
slowonly_r50_randomresizedcrop_320p_4x16x1_256e_kinetics400_rgb | short-side 320 | 8x2 | ResNet50 | 4x16 | None | 73.02 | 90.77 | 10 clips x 3 crops | ckpt | log | json |
slowonly_r50_randomresizedcrop_256p_4x16x1_256e_kinetics400_rgb | short-side 256 | 8x4 | ResNet50 | 4x16 | None | 72.76 | 90.51 | 10 clips x 3 crops | ckpt | log | json |
Notes:
For more details on data preparation, you can refer to Kinetics400 in Data Preparation.
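The three preprocessing variants in the data benchmark differ only in how the target frame size is chosen. The sketch below illustrates that computation, assuming sizes are `(width, height)` tuples and rounding to the nearest pixel. The function names and the 1280x720 source resolution are hypothetical and not part of the repository; actual resizing tools may round differently.

```python
def resize_fixed(size, target=(340, 256)):
    """Variant (1): force a fixed width x height, ignoring the aspect ratio."""
    return target


def resize_short_edge(size, short_edge):
    """Variants (2) and (3): scale so the shorter side equals `short_edge`,
    keeping the aspect ratio (rounded to the nearest pixel)."""
    width, height = size
    scale = short_edge / min(width, height)
    return (round(width * scale), round(height * scale))


if __name__ == "__main__":
    original = (1280, 720)  # hypothetical source resolution
    print(resize_fixed(original))            # variant (1)
    print(resize_short_edge(original, 320))  # variant (2)
    print(resize_short_edge(original, 256))  # variant (3)
```

Short-edge resizing preserves the aspect ratio, whereas the fixed 340x256 target can distort videos whose aspect ratio differs from it.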
You can use the following command to train a model.
python tools/train.py ${CONFIG_FILE} [optional arguments]
Example: train the SlowOnly model on the Kinetics-400 dataset deterministically, with periodic validation.
python tools/train.py configs/recognition/slowonly/slowonly_r50_4x16x1_256e_kinetics400_rgb.py \
--work-dir work_dirs/slowonly_r50_4x16x1_256e_kinetics400_rgb \
--validate --seed 0 --deterministic
For more details, refer to the Training setting section in getting_started.
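The config name passed to `tools/train.py` appears to encode the sampling settings. Reading `4x16x1` as clip length x frame interval x number of clips, and `256e` as the number of epochs, is an assumption based on the `Input` column (`4x16`) and the `40x3 frames` inference note in the tables above. The helpers below are a hypothetical illustration, not a repository API.

```python
import re


def parse_sampling(config_name):
    """Extract (clip_len, frame_interval, num_clips, epochs) from a config name.

    Assumes the name contains a `{clip_len}x{frame_interval}x{num_clips}` token
    immediately followed by a `{epochs}e` token, as in the names listed above.
    """
    match = re.search(r"_(\d+)x(\d+)x(\d+)_(\d+)e_", config_name)
    if match is None:
        raise ValueError(f"no sampling token found in {config_name!r}")
    clip_len, frame_interval, num_clips, epochs = map(int, match.groups())
    return clip_len, frame_interval, num_clips, epochs


def frames_per_test_video(clip_len, test_clips=10, test_crops=3):
    """Return (frames, crops) processed per video under a clips x crops protocol."""
    return clip_len * test_clips, test_crops


if __name__ == "__main__":
    name = "slowonly_r50_4x16x1_256e_kinetics400_rgb"
    clip_len, frame_interval, num_clips, epochs = parse_sampling(name)
    print(clip_len, frame_interval, num_clips, epochs)
    print(frames_per_test_video(clip_len))
```

Under this reading, a `4x16x1` config tested with 10 clips x 3 crops processes 40 frames per crop, which matches the `40x3 frames` entry in the results table.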
You can use the following command to test a model.
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [optional arguments]
Example: test the SlowOnly model on the Kinetics-400 dataset and dump the results to a JSON file.
python tools/test.py configs/recognition/slowonly/slowonly_r50_4x16x1_256e_kinetics400_rgb.py \
checkpoints/SOME_CHECKPOINT.pth --eval top_k_accuracy mean_class_accuracy \
--out result.json --average-clips=prob
For more details, refer to the Test a dataset section in getting_started.
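The `--average-clips=prob` flag in the test example combines the scores from multiple test views (for instance 10 clips x 3 crops) into one prediction per video, which is then scored with top-k accuracy. The pure-Python sketch below assumes that `prob` means applying softmax to each view's class scores before averaging. The function names are hypothetical and the toy scores are made up for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one view's class scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_clips_prob(view_scores):
    """Average per-view softmax probabilities into one distribution per video."""
    probs = [softmax(scores) for scores in view_scores]
    num_views = len(probs)
    return [sum(column) / num_views for column in zip(*probs)]


def top_k_hit(video_probs, label, k):
    """True if `label` is among the k highest-probability classes."""
    ranked = sorted(range(len(video_probs)),
                    key=lambda i: video_probs[i], reverse=True)
    return label in ranked[:k]


if __name__ == "__main__":
    # Three views of one video over four classes (toy numbers).
    views = [[2.0, 1.0, 0.1, -1.0],
             [0.5, 2.5, 0.0, -0.5],
             [1.8, 1.7, 0.2, 0.0]]
    video_probs = average_clips_prob(views)
    print(video_probs)
    print(top_k_hit(video_probs, label=0, k=1))
```

Averaging probabilities rather than raw scores keeps any single view with unusually large logits from dominating the video-level prediction.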