OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), and Video Instance Segmentation (VIS) within a unified framework.

Overview

English | 简体中文

Documentation: https://mmtracking.readthedocs.io/

Introduction

MMTracking is an open source video perception toolbox based on PyTorch. It is a part of the OpenMMLab project.

The master branch works with PyTorch 1.3+.

Major features

  • The First Unified Video Perception Platform

    We are the first open source toolbox that unifies versatile video perception tasks, including video object detection, multiple object tracking, single object tracking, and video instance segmentation.

  • Modular Design

    We decompose the video perception framework into different components, so one can easily construct a customized method by combining different modules.

  • Simple, Fast and Strong

    Simple: MMTracking interacts with other OpenMMLab projects. It is built upon MMDetection, so any detector can be leveraged simply by modifying the configs (see the config sketch after this list).

    Fast: All operations run on GPUs. The training and inference speeds are faster than or comparable to other implementations.

    Strong: We reproduce state-of-the-art models and some of them even outperform the official implementations.
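
To make the config-driven workflow concrete, here is a minimal sketch of how the underlying detector can be swapped. It assumes a SELSA VID config similar to those shipped with MMTracking; the base file name below is illustrative, and only the detector's backbone is overridden while everything else is inherited.

# Hypothetical MMTracking-style config: inherit an assumed ResNet-50 SELSA
# config and override only the backbone of the underlying MMDetection detector.
_base_ = ['./selsa_faster_rcnn_r50_dc5_1x_imagenetvid.py']

model = dict(
    detector=dict(
        backbone=dict(
            depth=101,  # swap ResNet-50 for ResNet-101
            init_cfg=dict(
                type='Pretrained', checkpoint='torchvision://resnet101'))))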

License

This project is released under the Apache 2.0 license.

Changelog

v0.8.0 was released on 03/10/2021. Please refer to changelog.md for details and release history.

Benchmark and model zoo

Results and models are available in the model zoo.

Supported methods of video object detection:

Supported methods of multiple object tracking:

Supported methods of single object tracking:

Supported methods of video instance segmentation:

Installation

Please refer to install.md for install instructions.

Getting Started

Please see dataset.md and quick_run.md for the basic usage of MMTracking. We also provide usage tutorials, such as learning about configs, detailed descriptions of the VID, MOT, and SOT configs, customizing datasets, customizing data pipelines, customizing VID, MOT, and SOT models, customizing runtime settings, and useful tools.
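
Before diving into the tutorials, the following is a minimal MOT inference sketch. It assumes the high-level helpers init_model and inference_mot from mmtrack.apis, as used by the demo scripts; the config, checkpoint, and video paths are placeholders and may differ across releases.

import mmcv
from mmtrack.apis import inference_mot, init_model

# Placeholder config; any MOT config from the model zoo can be used.
# checkpoint=None is only for brevity; pass a downloaded checkpoint for
# meaningful results.
config_file = 'configs/mot/deepsort/sort_faster-rcnn_fpn_4e_mot17-private.py'
model = init_model(config_file, checkpoint=None, device='cuda:0')

video = mmcv.VideoReader('demo/demo.mp4')
for frame_id, img in enumerate(video):
    # result holds the tracked boxes (with identities) for this frame.
    result = inference_mot(model, img, frame_id=frame_id)
    model.show_result(img, result, out_file=f'outputs/{frame_id:06d}.jpg')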

Contributing

We appreciate all contributions to improve MMTracking. Please refer to CONTRIBUTING.md for the contributing guideline.

Acknowledgement

MMTracking is an open source project that welcomes any contribution and feedback. We hope that the toolbox and benchmark can serve the growing research community by providing a flexible and standardized toolkit to reimplement existing methods and develop new video perception methods.

Citation

If you find this project useful in your research, please consider citing:

@misc{mmtrack2020,
    title={{MMTracking: OpenMMLab} video perception toolbox and benchmark},
    author={MMTracking Contributors},
    howpublished = {\url{https://github.com/open-mmlab/mmtracking}},
    year={2020}
}

Projects in OpenMMLab

  • MMCV: OpenMMLab foundational library for computer vision.
  • MIM: MIM Installs OpenMMLab Packages.
  • MMClassification: OpenMMLab image classification toolbox and benchmark.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMEditing: OpenMMLab image and video editing toolbox.
  • MMOCR: OpenMMLab text detection, recognition and understanding toolbox.
  • MMGeneration: OpenMMLab Generative Model toolbox and benchmark.
  • MMFlow: OpenMMLab optical flow toolbox and benchmark.
Comments
  • Parameter and variable settings in the DFF model

    In "/mmtracking/mmtrack/models/motion/flownet_simple.py," the init parameters "flow_img_norm_std=[255.0, 255.0, 255.0]" and "flow_img_norm_mean=[0.411, 0.432, 0.450]" . What's the meaning of these parameters? I'm using a type of data with 10 channels, how should I set these parameters?

    Also, in "prepare_imgs" method, "img_metas[0]['img_norm_cfg']['mean']" and "img_metas[0]['img_norm_cfg']['std']" are both initialized with 0.Is it necessary to reassign the value while training or testing? If necessary, how and what value should I assign to these variables?

    opened by yan811 13
  • Training on the MOT dataset

    Hello everyone, and thank you in advance for your answers. I am new here, so forgive me if I cannot explain myself well. I am trying to train on the MOT dataset, but I have a problem with PyTorch: torch.distributed.launch is giving me an error, and I need to switch to torchrun (transitioning from torch.distributed.launch to torchrun), but I could not figure out how to modify the train.py script. Can you please help me with that? Thanks again.
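
    For reference, a minimal sketch of the usual adjustment when moving from torch.distributed.launch to torchrun: torchrun passes the local rank through the LOCAL_RANK environment variable instead of injecting a --local_rank argument, so the script can fall back to the environment. This shows the general pattern, not a guaranteed copy of tools/train.py.

    import argparse
    import os

    parser = argparse.ArgumentParser(description='Train a model')
    # torch.distributed.launch used to inject --local_rank; keep it for
    # backwards compatibility, but prefer the LOCAL_RANK env var set by torchrun.
    parser.add_argument('--local_rank', type=int, default=0)
    args, _ = parser.parse_known_args()

    if 'LOCAL_RANK' not in os.environ:
        os.environ['LOCAL_RANK'] = str(args.local_rank)

    local_rank = int(os.environ['LOCAL_RANK'])
    print(f'local rank: {local_rank}')

    The launch command would then look something like: torchrun --nproc_per_node=2 tools/train.py <CONFIG> --launcher pytorch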

    opened by mehmetcanmitil 9
  • When training a VID model with multiple GPUs, validation hangs on the last few images (no progress even after waiting an hour).

    My log is as follows:

    2021-12-26 16:04:19,060 - mmtrack - INFO - Environment info:

    sys.platform: linux Python: 3.8.12 (default, Oct 12 2021, 13:49:34) [GCC 7.5.0] CUDA available: True GPU 0,1,2: GeForce RTX 2080 Ti CUDA_HOME: /usr/local/cuda NVCC: Cuda compilation tools, release 10.0, V10.0.130 GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609 PyTorch: 1.5.0 PyTorch compiling details: PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.1
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
    • CuDNN 7.6.3
    • Magma 2.5.2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

    TorchVision: 0.6.0a0+82fd1c8 OpenCV: 4.5.4 MMCV: 1.4.1 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMTracking: 0.8.0+

    2021-12-26 16:04:19,061 - mmtrack - INFO - Distributed training: True 2021-12-26 16:04:19,761 - mmtrack - INFO - Config: model = dict( detector=dict( type='FasterRCNN', backbone=dict( type='ResNet', depth=50, num_stages=4, out_indices=(3, ), strides=(1, 2, 2, 1), dilations=(1, 1, 1, 2), frozen_stages=1, norm_cfg=dict(type='BN', requires_grad=True), norm_eval=True, style='pytorch'), neck=dict( type='ChannelMapper', in_channels=[2048], out_channels=512, kernel_size=3), rpn_head=dict( type='RPNHead', in_channels=512, feat_channels=512, anchor_generator=dict( type='AnchorGenerator', scales=[4, 8, 16, 32], ratios=[0.5, 1.0, 2.0], strides=[16]), bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[1.0, 1.0, 1.0, 1.0]), loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0), loss_bbox=dict( type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)), roi_head=dict( type='SelsaRoIHead', bbox_roi_extractor=dict( type='TemporalRoIAlign', roi_layer=dict( type='RoIAlign', output_size=7, sampling_ratio=2), out_channels=512, featmap_strides=[16], num_most_similar_points=2, num_temporal_attention_blocks=4), bbox_head=dict( type='SelsaBBoxHead', in_channels=512, fc_out_channels=1024, roi_feat_size=7, num_classes=30, bbox_coder=dict( type='DeltaXYWHBBoxCoder', target_means=[0.0, 0.0, 0.0, 0.0], target_stds=[0.2, 0.2, 0.2, 0.2]), reg_class_agnostic=False, loss_cls=dict( type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0), loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0), num_shared_fcs=3, aggregator=dict( type='SelsaAggregator', in_channels=1024, num_attention_blocks=16))), train_cfg=dict( rpn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.7, neg_iou_thr=0.3, min_pos_iou=0.3, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.5, neg_pos_ub=-1, add_gt_as_proposals=False), allowed_border=0, pos_weight=-1, debug=False), rpn_proposal=dict( nms_pre=6000, max_per_img=600, nms=dict(type='nms', iou_threshold=0.7), min_bbox_size=0), rcnn=dict( assigner=dict( type='MaxIoUAssigner', pos_iou_thr=0.5, neg_iou_thr=0.5, min_pos_iou=0.5, ignore_iof_thr=-1), sampler=dict( type='RandomSampler', num=256, pos_fraction=0.25, neg_pos_ub=-1, add_gt_as_proposals=True), pos_weight=-1, debug=False)), test_cfg=dict( rpn=dict( nms_pre=6000, max_per_img=300, nms=dict(type='nms', iou_threshold=0.7), min_bbox_size=0), rcnn=dict( score_thr=0.0001, nms=dict(type='nms', iou_threshold=0.5), max_per_img=100))), type='SELSA') dataset_type = 'ImagenetVIDDataset' data_root = 'data/FALD_VID/' img_norm_cfg = dict( mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True) train_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict(type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ] test_pipeline = [ dict(type='LoadMultiImagesFromFile'), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.0), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], 
to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img'], meta_keys=('num_left_ref_imgs', 'frame_stride')), dict(type='ConcatVideoReferences'), dict(type='MultiImagesToTensor', ref_prefix='ref'), dict(type='ToList') ] data = dict( samples_per_gpu=1, workers_per_gpu=2, train=dict( type='ImagenetVIDDataset', ann_file= 'data/FALD_VID/COCOVIDannotations/imagenet_vid_train_every10frames.json', img_prefix='data/FALD_VID/Data/VID', ref_img_sampler=dict( num_ref_imgs=2, frame_range=9, filter_key_img=False, method='bilateral_uniform'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict(type='SeqLoadAnnotations', with_bbox=True, with_track=True), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_instance_ids']), dict(type='ConcatVideoReferences'), dict(type='SeqDefaultFormatBundle', ref_prefix='ref') ]), val=dict( type='ImagenetVIDDataset', ann_file='data/FALD_VID/annotations/imagenet_vid_val.json', img_prefix='data/FALD_VID/Data/VID', ref_img_sampler=dict( num_ref_imgs=14, frame_range=[-7, 7], method='test_with_adaptive_stride'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.0), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img'], meta_keys=('num_left_ref_imgs', 'frame_stride')), dict(type='ConcatVideoReferences'), dict(type='MultiImagesToTensor', ref_prefix='ref'), dict(type='ToList') ], test_mode=True), test=dict( type='ImagenetVIDDataset', ann_file='data/FALD_VID/annotations/imagenet_vid_val.json', img_prefix='data/FALD_VID/Data/VID', ref_img_sampler=dict( num_ref_imgs=14, frame_range=[-7, 7], method='test_with_adaptive_stride'), pipeline=[ dict(type='LoadMultiImagesFromFile'), dict(type='SeqResize', img_scale=(1000, 600), keep_ratio=True), dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.0), dict( type='SeqNormalize', mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True), dict(type='SeqPad', size_divisor=16), dict( type='VideoCollect', keys=['img'], meta_keys=('num_left_ref_imgs', 'frame_stride')), dict(type='ConcatVideoReferences'), dict(type='MultiImagesToTensor', ref_prefix='ref'), dict(type='ToList') ], test_mode=True)) optimizer = dict(type='SGD', lr=0.005, momentum=0.9, weight_decay=0.0001) optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2)) checkpoint_config = dict(interval=1) log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')]) dist_params = dict(backend='nccl') log_level = 'INFO' load_from = None resume_from = None workflow = [('train', 1)] lr_config = dict( policy='step', warmup='linear', warmup_iters=500, warmup_ratio=0.3333333333333333, step=[2, 5]) total_epochs = 4 evaluation = dict(metric=['bbox'], interval=4) work_dir = './work_dirs/20211226_001_try3/' gpu_ids = range(0, 1)

    2021-12-26 16:04:24,438 - mmtrack - INFO - Set random seed to 2034425034, deterministic: False 2021-12-26 16:04:25,201 - mmtrack - INFO - initialize ResNet with init_cfg [{'type': 'Kaiming', 'layer': 'Conv2d'}, {'type': 'Constant', 'val': 1, 'layer': ['_BatchNorm', 'GroupNorm']}] 2021-12-26 16:04:25,466 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,467 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,468 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,470 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,471 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,472 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,473 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,475 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,477 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,479 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,481 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,482 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,484 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,490 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,496 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,500 - mmtrack - INFO - initialize Bottleneck with init_cfg {'type': 'Constant', 'val': 0, 'override': {'name': 'norm3'}} 2021-12-26 16:04:25,523 - mmtrack - INFO - initialize ChannelMapper with init_cfg {'type': 'Xavier', 'layer': 'Conv2d', 'distribution': 'uniform'} 2021-12-26 16:04:25,583 - mmtrack - INFO - initialize RPNHead with init_cfg {'type': 'Normal', 'layer': 'Conv2d', 'std': 0.01} 2021-12-26 16:04:25,637 - mmtrack - INFO - initialize SelsaBBoxHead with init_cfg [{'type': 'Normal', 'std': 0.01, 'override': {'name': 'fc_cls'}}, {'type': 'Normal', 'std': 0.001, 'override': {'name': 'fc_reg'}}, {'type': 'Xavier', 'distribution': 'uniform', 'override': [{'name': 'shared_fcs'}, {'name': 'cls_fcs'}, {'name': 'reg_fcs'}]}] Name of parameter - Initialization information

    detector.backbone.conv1.weight - torch.Size([64, 3, 7, 7]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.conv1.weight - torch.Size([64, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.conv2.weight - torch.Size([64, 64, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.bn2.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.bn2.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.conv3.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.bn3.weight - torch.Size([256]): ConstantInit: val=0, bias=0

    detector.backbone.layer1.0.bn3.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.downsample.0.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.0.downsample.1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.0.downsample.1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.conv1.weight - torch.Size([64, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.1.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.conv2.weight - torch.Size([64, 64, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.1.bn2.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.bn2.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.1.conv3.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.1.bn3.weight - torch.Size([256]): ConstantInit: val=0, bias=0

    detector.backbone.layer1.1.bn3.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.conv1.weight - torch.Size([64, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.2.bn1.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.bn1.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.conv2.weight - torch.Size([64, 64, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.2.bn2.weight - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.bn2.bias - torch.Size([64]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer1.2.conv3.weight - torch.Size([256, 64, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer1.2.bn3.weight - torch.Size([256]): ConstantInit: val=0, bias=0

    detector.backbone.layer1.2.bn3.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.conv1.weight - torch.Size([128, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.0.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.downsample.0.weight - torch.Size([512, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.0.downsample.1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.0.downsample.1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.conv1.weight - torch.Size([128, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.1.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.1.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.1.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.1.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.1.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.conv1.weight - torch.Size([128, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.2.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.2.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.2.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.2.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.2.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.conv1.weight - torch.Size([128, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.3.bn1.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.bn1.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.conv2.weight - torch.Size([128, 128, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.3.bn2.weight - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.bn2.bias - torch.Size([128]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer2.3.conv3.weight - torch.Size([512, 128, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer2.3.bn3.weight - torch.Size([512]): ConstantInit: val=0, bias=0

    detector.backbone.layer2.3.bn3.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.conv1.weight - torch.Size([256, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.0.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.downsample.0.weight - torch.Size([1024, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.0.downsample.1.weight - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.0.downsample.1.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.1.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.1.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.1.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.1.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.1.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.2.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.2.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.2.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.2.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.2.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.3.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.3.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.3.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.3.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.3.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.4.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.4.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.4.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.4.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.4.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.conv1.weight - torch.Size([256, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.5.bn1.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.bn1.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.conv2.weight - torch.Size([256, 256, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.5.bn2.weight - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.bn2.bias - torch.Size([256]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer3.5.conv3.weight - torch.Size([1024, 256, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer3.5.bn3.weight - torch.Size([1024]): ConstantInit: val=0, bias=0

    detector.backbone.layer3.5.bn3.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.conv1.weight - torch.Size([512, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.bn1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.bn1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.conv2.weight - torch.Size([512, 512, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.bn2.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.bn2.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.conv3.weight - torch.Size([2048, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.bn3.weight - torch.Size([2048]): ConstantInit: val=0, bias=0

    detector.backbone.layer4.0.bn3.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.downsample.0.weight - torch.Size([2048, 1024, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.0.downsample.1.weight - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.0.downsample.1.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.conv1.weight - torch.Size([512, 2048, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.1.bn1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.bn1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.conv2.weight - torch.Size([512, 512, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.1.bn2.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.bn2.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.1.conv3.weight - torch.Size([2048, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.1.bn3.weight - torch.Size([2048]): ConstantInit: val=0, bias=0

    detector.backbone.layer4.1.bn3.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.conv1.weight - torch.Size([512, 2048, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.2.bn1.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.bn1.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.conv2.weight - torch.Size([512, 512, 3, 3]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.2.bn2.weight - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.bn2.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.backbone.layer4.2.conv3.weight - torch.Size([2048, 512, 1, 1]): KaimingInit: a=0, mode=fan_out, nonlinearity=relu, distribution =normal, bias=0

    detector.backbone.layer4.2.bn3.weight - torch.Size([2048]): ConstantInit: val=0, bias=0

    detector.backbone.layer4.2.bn3.bias - torch.Size([2048]): The value is the same before and after calling init_weights of SELSA

    detector.neck.convs.0.conv.weight - torch.Size([512, 2048, 3, 3]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.neck.convs.0.conv.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.rpn_head.rpn_conv.weight - torch.Size([512, 512, 3, 3]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_conv.bias - torch.Size([512]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_cls.weight - torch.Size([12, 512, 1, 1]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_cls.bias - torch.Size([12]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_reg.weight - torch.Size([48, 512, 1, 1]): NormalInit: mean=0, std=0.01, bias=0

    detector.rpn_head.rpn_reg.bias - torch.Size([48]): NormalInit: mean=0, std=0.01, bias=0

    detector.roi_head.bbox_roi_extractor.embed_network.conv.weight - torch.Size([512, 512, 3, 3]): Initialized by user-defined init_weights in ConvModule

    detector.roi_head.bbox_roi_extractor.embed_network.conv.bias - torch.Size([512]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.fc_cls.weight - torch.Size([31, 1024]): NormalInit: mean=0, std=0.01, bias=0

    detector.roi_head.bbox_head.fc_cls.bias - torch.Size([31]): NormalInit: mean=0, std=0.01, bias=0

    detector.roi_head.bbox_head.fc_reg.weight - torch.Size([120, 1024]): NormalInit: mean=0, std=0.001, bias=0

    detector.roi_head.bbox_head.fc_reg.bias - torch.Size([120]): NormalInit: mean=0, std=0.001, bias=0

    detector.roi_head.bbox_head.shared_fcs.0.weight - torch.Size([1024, 25088]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.0.bias - torch.Size([1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.1.weight - torch.Size([1024, 1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.1.bias - torch.Size([1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.2.weight - torch.Size([1024, 1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.shared_fcs.2.bias - torch.Size([1024]): XavierInit: gain=1, distribution=uniform, bias=0

    detector.roi_head.bbox_head.aggregator.0.fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.0.ref_fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.1.ref_fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc_embed.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc_embed.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc.weight - torch.Size([1024, 1024]): The value is the same before and after calling init_weights of SELSA

    detector.roi_head.bbox_head.aggregator.2.ref_fc.bias - torch.Size([1024]): The value is the same before and after calling init_weights of SELSA
    2021-12-26 16:04:28,460 - mmtrack - INFO - Start running, host: [email protected], work_dir: /data/yangjiahui/VIDProject/mmtracking/work_dirs/20211226_001_try3 2021-12-26 16:04:28,461 - mmtrack - INFO - Hooks will be executed in the following order: before_run: (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) CheckpointHook
    (NORMAL ) DistEvalHook
    (VERY_LOW ) TextLoggerHook

    before_train_epoch: (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) DistSamplerSeedHook
    (NORMAL ) DistEvalHook
    (LOW ) IterTimerHook
    (VERY_LOW ) TextLoggerHook

    before_train_iter: (VERY_HIGH ) StepLrUpdaterHook
    (NORMAL ) DistEvalHook
    (LOW ) IterTimerHook

    after_train_iter: (ABOVE_NORMAL) OptimizerHook
    (NORMAL ) CheckpointHook
    (NORMAL ) DistEvalHook
    (LOW ) IterTimerHook
    (VERY_LOW ) TextLoggerHook

    after_train_epoch: (NORMAL ) CheckpointHook
    (NORMAL ) DistEvalHook
    (VERY_LOW ) TextLoggerHook

    before_val_epoch: (NORMAL ) DistSamplerSeedHook
    (LOW ) IterTimerHook
    (VERY_LOW ) TextLoggerHook

    before_val_iter: (LOW ) IterTimerHook

    after_val_iter: (LOW ) IterTimerHook

    after_val_epoch: (VERY_LOW ) TextLoggerHook

    after_run: (VERY_LOW ) TextLoggerHook

    2021-12-26 16:04:28,461 - mmtrack - INFO - workflow: [('train', 1)], max: 4 epochs 2021-12-26 16:04:28,461 - mmtrack - INFO - Checkpoints will be saved to /data/yangjiahui/VIDProject/mmtracking/work_dirs/20211226_001_try3 by HardDiskBackend. 2021-12-26 16:05:00,501 - mmtrack - INFO - Saving checkpoint at 1 epochs 2021-12-26 16:05:32,658 - mmtrack - INFO - Saving checkpoint at 2 epochs 2021-12-26 16:06:04,769 - mmtrack - INFO - Saving checkpoint at 3 epochs 2021-12-26 16:06:37,068 - mmtrack - INFO - Saving checkpoint at 4 epochs

    opened by FarranYang 9
  • Many errors when training the ReID model of Tracktor on MOT17.

    I successfully ran the official Tracktor repo, but I cannot run this one. I use the same command (the default ReID training command), yet I get different errors on different days.

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug: A clear and concise description of what the bug is.

    Reproduction

    1. What command or script did you run?
    A placeholder for the command.
    
    2. Did you make any modifications to the code or config? Do you understand what you have modified?
    3. What dataset did you use and what task did you run?

    Environment

    1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here.
    2. You may add additional information that may be helpful for locating the problem, such as
      • How you installed PyTorch [e.g., pip, conda, source]
      • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

    Error traceback: If applicable, paste the error traceback here.

    A placeholder for the traceback.
    

    Bug fix: If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

    opened by sjtuytc 9
  • Person ID changes when viewed from a different camera using Tracktor

    I was testing the Tracktor model on some videos and found that when the camera changes, the ID assigned to the tracked object also changes. For example, in the last few seconds of the inference video below (obtained by running the Tracktor code), the runners are assigned new IDs. What could be the possible reason? Is it because the same person is being viewed from a different camera angle, or do I need to retrain the re-ID model? The dataset used was MOT20, and the configuration was tracktor_faster-rcnn_r50_fpn_8e_mot20-public-half.

    Input video: https://drive.google.com/file/d/1IVxcL3a5jUH3huJuyVzgDepIpBE62H3F/view?usp=sharing
    Inference video: https://drive.google.com/file/d/1Rcl3nrdTQznyPO4GQLm7_juYzsSZtLK4/view?usp=sharing

    opened by sparshgarg23 8
  • ReID training

    Thanks for your error report and we appreciate it a lot.

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug: A clear and concise description of what the bug is.

    Reproduction

    1. What command or script did you run?
    python3 ./tools/train.py configs/reid/resnet50_b32x8_MOT17.py --work-dir work_dirs/resnet50_b32x8_MOT17
    
    2. I did not make any modifications to the code except the dataset path
    3. I am running ReID training on the MOT dataset

    Environment

    1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here. sys.platform: linux Python: 3.8.11 (default, Jul 3 2021, 17:53:42) [GCC 7.5.0] CUDA available: True GPU 0: TITAN Xp CUDA_HOME: None GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0 PyTorch: 1.7.1+cu101 PyTorch compiling details: PyTorch built with:
    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • CUDA Runtime 10.1
    • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
    • CuDNN 7.6.3
    • Magma 2.5.2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

    TorchVision: 0.8.2+cu101 OpenCV: 4.5.3 MMCV: 1.3.11 MMCV Compiler: GCC 7.3 MMCV CUDA Compiler: 10.1 MMTracking: 0.6.0+4d78b77

    2. You may add additional information that may be helpful for locating the problem, such as
      • How you installed PyTorch [e.g., pip, conda, source]
      • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

    Error traceback: If applicable, paste the error traceback here.

    sys.platform: linux
    Python: 3.8.11 (default, Jul  3 2021, 17:53:42) [GCC 7.5.0]
    CUDA available: True
    GPU 0: TITAN Xp
    CUDA_HOME: None
    GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
    PyTorch: 1.7.1+cu101
    PyTorch compiling details: PyTorch built with:
      - GCC 7.3
      - C++ Version: 201402
      - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v1.6.0 (Git Hash 5ef631a030a6f73131c77892041042805a06064f)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 10.1
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75
      - CuDNN 7.6.3
      - Magma 2.5.2
      - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 
    
    TorchVision: 0.8.2+cu101
    OpenCV: 4.5.3
    MMCV: 1.3.11
    MMCV Compiler: GCC 7.3
    MMCV CUDA Compiler: 10.1
    MMTracking: 0.6.0+4d78b77
    ------------------------------------------------------------
    
    2021-08-17 11:24:25,348 - mmtrack - INFO - Distributed training: False
    2021-08-17 11:24:26,303 - mmtrack - INFO - Config:
    dataset_type = 'ReIDDataset'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadMultiImagesFromFile', to_float32=True),
        dict(
            type='SeqResize',
            img_scale=(128, 256),
            share_params=False,
            keep_ratio=False,
            bbox_clip_border=False,
            override=False),
        dict(
            type='SeqRandomFlip',
            share_params=False,
            flip_ratio=0.5,
            direction='horizontal'),
        dict(
            type='SeqNormalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='VideoCollect', keys=['img', 'gt_label']),
        dict(type='ReIDFormatBundle')
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='Resize', img_scale=(128, 256), keep_ratio=False),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='ImageToTensor', keys=['img']),
        dict(type='Collect', keys=['img'], meta_keys=[])
    ]
    data_root = '/projects/datasets/MOT/MOT17/'
    data = dict(
        samples_per_gpu=2,
        workers_per_gpu=2,
        train=dict(
            type='ReIDDataset',
            triplet_sampler=dict(num_ids=8, ins_per_id=4),
            data_prefix='/projects/datasets/MOT/MOT17/reid/imgs',
            ann_file='/projects/datasets/MOT/MOT17/reid/meta/train_80.txt',
            pipeline=[
                dict(type='LoadMultiImagesFromFile', to_float32=True),
                dict(
                    type='SeqResize',
                    img_scale=(128, 256),
                    share_params=False,
                    keep_ratio=False,
                    bbox_clip_border=False,
                    override=False),
                dict(
                    type='SeqRandomFlip',
                    share_params=False,
                    flip_ratio=0.5,
                    direction='horizontal'),
                dict(
                    type='SeqNormalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='VideoCollect', keys=['img', 'gt_label']),
                dict(type='ReIDFormatBundle')
            ]),
        val=dict(
            type='ReIDDataset',
            triplet_sampler=None,
            data_prefix='/projects/datasets/MOT/MOT17/reid/imgs',
            ann_file='/projects/datasets/MOT/MOT17/reid/meta/val_20.txt',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='Resize', img_scale=(128, 256), keep_ratio=False),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'], meta_keys=[])
            ]),
        test=dict(
            type='ReIDDataset',
            triplet_sampler=None,
            data_prefix='/projects/datasets/MOT/MOT17/reid/imgs',
            ann_file='/projects/datasets/MOT/MOT17/reid/meta/val_20.txt',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='Resize', img_scale=(128, 256), keep_ratio=False),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'], meta_keys=[])
            ]))
    evaluation = dict(interval=1, metric='mAP')
    optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(grad_clip=None)
    checkpoint_config = dict(interval=1)
    log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    USE_MMCLS = True
    model = dict(
        type='BaseReID',
        backbone=dict(
            type='ResNet',
            depth=50,
            num_stages=4,
            out_indices=(3, ),
            style='pytorch'),
        neck=dict(type='GlobalAveragePooling', kernel_size=(8, 4), stride=1),
        head=dict(
            type='LinearReIDHead',
            num_fcs=1,
            in_channels=2048,
            fc_channels=1024,
            out_channels=128,
            num_classes=378,
            loss=dict(type='CrossEntropyLoss', loss_weight=1.0),
            loss_pairwise=dict(type='TripletLoss', margin=0.3, loss_weight=1.0),
            norm_cfg=dict(type='BN1d'),
            act_cfg=dict(type='ReLU')),
        init_cfg=dict(
            type='Pretrained',
            checkpoint=
            'https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth'
        ))
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=1000,
        warmup_ratio=0.001,
        step=[5])
    total_epochs = 6
    work_dir = 'work_dirs/resnet50_b32x8_MOT17'
    gpu_ids = range(0, 1)
    
    2021-08-17 11:24:26,638 - mmtrack - INFO - initialize BaseReID with init_cfg {'type': 'Pretrained', 'checkpoint': 'https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth'}
    2021-08-17 11:24:26,638 - mmcv - INFO - load model from: https://download.openmmlab.com/mmclassification/v0/resnet/resnet50_batch256_imagenet_20200708-cfb998bf.pth
    2021-08-17 11:24:26,638 - mmcv - INFO - Use load_from_http loader
    2021-08-17 11:24:26,844 - mmcv - WARNING - The model and loaded state dict do not match exactly
    
    unexpected key in source state_dict: head.fc.weight, head.fc.bias
    
    missing keys in source state_dict: head.fcs.0.fc.weight, head.fcs.0.fc.bias, head.fcs.0.bn.weight, head.fcs.0.bn.bias, head.fcs.0.bn.running_mean, head.fcs.0.bn.running_var, head.fc_out.weight, head.fc_out.bias, head.bn.weight, head.bn.bias, head.bn.running_mean, head.bn.running_var, head.classifier.weight, head.classifier.bias
    
    2021-08-17 11:24:33,803 - mmtrack - INFO - Start running, host: [email protected], work_dir: /home2/qljx17/Open-MMLab/mmtracking/work_dirs/resnet50_b32x8_MOT17
    2021-08-17 11:24:33,803 - mmtrack - INFO - Hooks will be executed in the following order:
    before_run:
    (VERY_HIGH   ) StepLrUpdaterHook                  
    (NORMAL      ) CheckpointHook                     
    (NORMAL      ) EvalHook                           
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_train_epoch:
    (VERY_HIGH   ) StepLrUpdaterHook                  
    (NORMAL      ) EvalHook                           
    (LOW         ) IterTimerHook                      
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_train_iter:
    (VERY_HIGH   ) StepLrUpdaterHook                  
    (NORMAL      ) EvalHook                           
    (LOW         ) IterTimerHook                      
     -------------------- 
    after_train_iter:
    (ABOVE_NORMAL) OptimizerHook                      
    (NORMAL      ) CheckpointHook                     
    (NORMAL      ) EvalHook                           
    (LOW         ) IterTimerHook                      
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    after_train_epoch:
    (NORMAL      ) CheckpointHook                     
    (NORMAL      ) EvalHook                           
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_val_epoch:
    (LOW         ) IterTimerHook                      
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    before_val_iter:
    (LOW         ) IterTimerHook                      
     -------------------- 
    after_val_iter:
    (LOW         ) IterTimerHook                      
     -------------------- 
    after_val_epoch:
    (VERY_LOW    ) TextLoggerHook                     
     -------------------- 
    2021-08-17 11:24:33,803 - mmtrack - INFO - workflow: [('train', 1)], max: 6 epochs
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [44,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [45,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [46,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    /pytorch/aten/src/THCUNN/ClassNLLCriterion.cu:59: ClassNLLCriterion_updateOutput_no_reduce_kernel: block: [0,0,0], thread: [47,0,0] Assertion `cur_target >= 0 && cur_target < n_classes` failed.
    Traceback (most recent call last):
      File "./tools/train.py", line 174, in <module>
        main()
      File "./tools/train.py", line 163, in main
        train_model(
      File "/home2/qljx17/Open-MMLab/mmtracking/mmtrack/apis/train.py", line 136, in train_model
        runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 127, in run
        epoch_runner(data_loaders[i], **kwargs)
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
        self.run_iter(data_batch, train_mode=True, **kwargs)
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 29, in run_iter
        outputs = self.model.train_step(data_batch, self.optimizer,
      File "/home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
        return self.module.train_step(*inputs[0], **kwargs[0])
      File "/home2/qljx17/Open-MMLab/mmclassification/mmcls/models/classifiers/base.py", line 146, in train_step
        loss, log_vars = self._parse_losses(losses)
      File "/home2/qljx17/Open-MMLab/mmclassification/mmcls/models/classifiers/base.py", line 97, in _parse_losses
        log_vars[loss_name] = loss_value.mean()
    RuntimeError: CUDA error: device-side assert triggered
    terminate called after throwing an instance of 'c10::Error'
      what():  CUDA error: device-side assert triggered
    Exception raised from create_event_internal at /pytorch/c10/cuda/CUDACachingAllocator.cpp:687 (most recent call first):
    frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x42 (0x7fc1479138b2 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libc10.so)
    frame #1: c10::cuda::CUDACachingAllocator::raw_delete(void*) + 0xad2 (0x7fc147b65952 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libc10_cuda.so)
    frame #2: c10::TensorImpl::release_resources() + 0x4d (0x7fc1478feb7d in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libc10.so)
    frame #3: <unknown function> + 0x5fd7a2 (0x7fc1920fb7a2 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
    frame #4: <unknown function> + 0x5fd856 (0x7fc1920fb856 in /home2/qljx17/Open-MMLab/evenv/lib/python3.8/site-packages/torch/lib/libtorch_python.so)
    frame #5: python3() [0x534ce6]
    frame #6: python3() [0x51c5d9]
    frame #7: python3() [0x52cb15]
    frame #8: python3() [0x52cb15]
    frame #9: python3() [0x500a2e]
    frame #10: python3() [0x57d905]
    frame #11: python3() [0x57d8bb]
    frame #12: python3() [0x57d8bb]
    frame #13: python3() [0x57d8bb]
    frame #14: python3() [0x57d8bb]
    frame #15: python3() [0x57d8bb]
    frame #16: python3() [0x57d8bb]
    frame #17: python3() [0x5f25e6]
    <omitting python frames>
    frame #23: __libc_start_main + 0xf3 (0x7fc1a2ef10b3 in /lib/x86_64-linux-gnu/libc.so.6)
    
    /var/spool/slurmd/job128755/slurm_script: line 21: 3941330 Aborted                 (core dumped) python3 ./tools/train.py configs/reid/resnet50_b32x8_MOT17.py --work-dir work_dirs/resnet50_b32x8_MOT17
    ^Z
    

    Bug fix: From the error above, I assume the cause is the number of classes. In the default config, num_classes is set to 378, which is taken from train_80.txt, hence the error appears. However, when I set num_classes to 512, which is the number of samples in the imgs folder, I am able to run the training without any error. Is there something I missed, or could the number of classes be the main problem here?
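    One way to sanity-check num_classes against the annotation file is to count the distinct identity labels it actually references. A minimal sketch, assuming each line of train_80.txt is '<relative image path> <identity label>':

    # Minimal sketch: count the distinct identity labels referenced by the ReID
    # annotation file, assuming each line is "<relative image path> <identity label>".
    ann_file = '/projects/datasets/MOT/MOT17/reid/meta/train_80.txt'  # path from the config above

    labels = []
    with open(ann_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                labels.append(int(parts[-1]))

    print('distinct identities:', len(set(labels)))
    print('max label id       :', max(labels))
    # num_classes in LinearReIDHead must be at least max(labels) + 1, otherwise the
    # CrossEntropyLoss target check (`cur_target < n_classes`) fails on the GPU.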

    opened by yonafalinie 8
  • What is the difference between load_from and pretrain?

    What is the difference between load_from and pretrain?

    Hello~ Thanks a lot for your awesome job, and I appreciate your effort! However, I have a problem that I hope you can help me solve. When I use the default config at configs/det/faster-rcnn_r50_fpn_4e_mot17-half.py to train a Faster R-CNN detector with MMTracking, I get NaN losses. But when I move the downloaded state dict, a Faster R-CNN pretrained on the COCO dataset, from the 'load_from' entry to the 'pretrain' entry of the detector, the NaN losses disappear. I wonder how this happens? What is the difference between 'load_from' and 'pretrain', since neither of them seems to strictly load the parameters? Thanks a lot again!

    I checked again and found that the 'pretrain' entry for the detector does NOT load the pretrained state dict as I expected; training starts from randomly initialized parameters instead. So how can I use the pretrained Faster R-CNN weights anyway?
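    For reference, a hedged sketch of the two ways a checkpoint is usually injected (the exact keys depend on the MMTracking version in use; init_cfg=dict(type='Pretrained', ...) appears in the config dumps on this page, while load_from is handled by the runner):

    # Sketch only; exact keys depend on the MMTracking/MMDetection version in use.

    # 1) load_from: the runner loads the whole checkpoint into the full wrapped model
    #    (detector + tracking components) before training, so the checkpoint's key
    #    names typically need the wrapper prefix, e.g. 'detector.backbone.conv1.weight'.
    load_from = 'checkpoints/faster_rcnn_r50_fpn_coco.pth'  # hypothetical path

    # 2) Pretrained init_cfg on a sub-module: only that sub-module is initialized from
    #    the checkpoint, and its keys are matched against the sub-module directly,
    #    e.g. 'backbone.conv1.weight'.
    model = dict(
        detector=dict(
            init_cfg=dict(
                type='Pretrained',
                checkpoint='checkpoints/faster_rcnn_r50_fpn_coco.pth')))  # hypothetical path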

    opened by gsygsy96 8
  • Problem met when running MOT demo

    Problem met when running MOT demo

    Hi, I met a problem when running the MOT demo. It said "IndexError: tensors used as indices must be long, byte or bool tensors". Here's my error log.

    Error Log

    Traceback (most recent call last):
      File "demo/demo_mot.py", line 94, in <module>
        main()
      File "demo/demo_mot.py", line 70, in main
        result = inference_mot(model, img, frame_id=i)
      File "/cluster/home/it_stu12/main/gjj/mmtracking/mmtrack/apis/inference.py", line 81, in inference_mot
        data = collate([data], samples_per_gpu=1)
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in collate
        for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in <dictcomp>
        for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 77, in collate
        return [collate(samples, samples_per_gpu) for samples in transposed]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 77, in <listcomp>
        return [collate(samples, samples_per_gpu) for samples in transposed]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in collate
        for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 81, in <dictcomp>
        for key in batch[0]
      File "/cluster/home/it_stu12/main/gjj/mmcv/mmcv/parallel/collate.py", line 80, in <dictcomp>
        key: collate([d[key] for d in batch], samples_per_gpu)
    IndexError: tensors used as indices must be long, byte or bool tensors
    /cluster/home/it_stu12/.conda/envs/gjj/lib/python3.7/tempfile.py:798: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpd1jtqm1n'>
      _warnings.warn(warn_message, ResourceWarning)

    Environment

    No CUDA runtime is found, using CUDA_HOME='/cluster/apps/cuda/10.1'
    sys.platform: linux
    Python: 3.7.10 (default, Jun 4 2021, 14:48:32) [GCC 7.5.0]
    CUDA available: False
    GCC: gcc (GCC) 5.4.0
    PyTorch: 1.6.0
    PyTorch compiling details: PyTorch built with:

    • GCC 7.3
    • C++ Version: 201402
    • Intel(R) Math Kernel Library Version 2019.0.5 Product Build 20190808 for Intel(R) 64 architecture applications
    • Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
    • OpenMP 201511 (a.k.a. OpenMP 4.5)
    • NNPACK is enabled
    • CPU capability usage: AVX2
    • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

    TorchVision: 0.7.0
    OpenCV: 4.1.0
    MMCV: 1.4.2
    MMCV Compiler: GCC 5.4
    MMCV CUDA Compiler: not available
    MMTracking: 0.8.0+603d6fe

    Some Other Problems

    • The doc of MMTracking 0.8 says that the MMCV version should be mmcv-full>=1.3.8, <1.4.0. But when I installed mmcv-full 1.3.9, it told me that my "mmcv-full is too old, please install mmcv-full >=1.3.16, <=1.5.0". Which one should I believe?
    • The Chinese version of the MMTracking 0.8 doc gives a demo script python demo/demo_mot.py configs/mot/deepsort/sort_faster-rcnn_fpn_4e_mot17-private.py --input demo/demo.mp4 --output mot.mp4. But I didn't find demo_mot.py in the demo folder; I found demo_mot_vis.py instead. Maybe the Chinese doc should be updated?

    Thank you so much!

    opened by AndrewGuo0930 7
  • time estimation log export

    time estimation log export

    I found this library very helpful. Great work! I have to ask: is it possible to keep exporting a log (time + weight of the tracked object) while the video is running, either from a live camera or a recorded video? Please guide me on how to export such a log using this library.
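    One hedged way to log per-frame information is to drive the inference API yourself instead of the demo script and write a CSV as frames are processed. The sketch below assumes the mmtrack.apis init_model/inference_mot entry points that appear elsewhere in these logs, and the 'track_bboxes' result key name may differ across versions:

    import csv
    import time

    import mmcv
    from mmtrack.apis import inference_mot, init_model

    # Hypothetical config/checkpoint paths; adjust to your setup.
    config = 'configs/mot/deepsort/sort_faster-rcnn_fpn_4e_mot17-private.py'
    model = init_model(config, checkpoint=None, device='cuda:0')

    video = mmcv.VideoReader('demo/demo.mp4')  # a live camera can be fed frame by frame instead
    with open('tracking_log.csv', 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['frame_id', 'seconds', 'num_tracks'])
        for frame_id, img in enumerate(video):
            start = time.time()
            result = inference_mot(model, img, frame_id=frame_id)
            # 'track_bboxes' is assumed to be a per-class list of [id, x1, y1, x2, y2, score]
            # arrays; the key name may differ across MMTracking versions.
            num_tracks = sum(len(b) for b in result.get('track_bboxes', []))
            writer.writerow([frame_id, time.time() - start, num_tracks])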

    opened by Tortoise17 7
  • pickle file has only det_bboxes

    pickle file has only det_bboxes

    Hello, I have tested on my custom dataset for VID and saved the results to a .pkl file. However, the pickle file seems to have only the det_bboxes and not the det_labels. Is there any way to add det_labels too? Any tips would be helpful!
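    For what it's worth, in the MMDetection-style result format the class labels are implicit: detections are stored as a per-class list of (N, 5) arrays, so det_labels can be reconstructed from the list indices. A hedged sketch, assuming the pickle holds a dict with a 'det_bboxes' list over frames:

    import pickle

    import numpy as np

    with open('results.pkl', 'rb') as f:  # hypothetical path passed to --out during testing
        results = pickle.load(f)

    # Assumed layout: results['det_bboxes'] is a list over frames, each entry a per-class
    # list of (N, 5) arrays [x1, y1, x2, y2, score]; the class label is the list index.
    for frame_idx, per_class in enumerate(results['det_bboxes']):
        bboxes = np.vstack(per_class) if per_class else np.zeros((0, 5))
        labels = (np.concatenate([np.full(len(b), cls_id, dtype=np.int64)
                                  for cls_id, b in enumerate(per_class)])
                  if per_class else np.zeros((0,), dtype=np.int64))
        # bboxes: (N, 5), labels: (N,) -- the 'det_labels' the pickle does not store explicitly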

    opened by godwinrayanc 6
  • How to select which classes of the detector outputs are fed into the reid model?

    How to select which classes of the detector outputs are fed into the reid model?

    I have trained a detector model in MMDetection with multiple classes. If I want to feed only the "person" class outputs from the detector model to the reid model during inference, can I do that via the config or any other method?

    Also, if I have to feed an MMDetection pretrained model into the tracker, what config changes have to be made?

    Thank you in advance
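    I am not certain the release in question exposes a config switch for this, but as a hedged post-processing sketch one can zero out every class except 'person' in an MMDetection-style per-class detection result before anything downstream consumes it (the class index is an assumption; check the detector's CLASSES order):

    # Hypothetical post-processing sketch: keep only one class from an
    # MMDetection-style per-class detection result.
    import numpy as np

    def keep_single_class(per_class_dets, keep_idx):
        """per_class_dets: list of (N, 5) arrays, one entry per class."""
        return [
            dets if cls_id == keep_idx else np.empty((0, 5), dtype=dets.dtype)
            for cls_id, dets in enumerate(per_class_dets)
        ]

    # e.g. if 'person' is class index 0 in the detector's CLASSES tuple:
    # result['det_bboxes'] = keep_single_class(result['det_bboxes'], keep_idx=0)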

    opened by Balakumaran-kandula 6
  • I want to train the masktrackrcnn, but it occurs: KeyError: "YouTubeVISDataset: 'image_id'"

    I want to train the masktrackrcnn, but it occurs: KeyError: "YouTubeVISDataset: 'image_id'"

    Hello! I want to train masktrackrcnn with the official youtube_vis dataset, but it raises KeyError: "YouTubeVISDataset: 'image_id'". My data tree is attached as a screenshot.

    Traceback (most recent call last):
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
        return obj_cls(**args)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/youtube_vis_dataset.py", line 44, in __init__
        super().__init__(*args, **kwargs)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py", line 46, in __init__
        super().__init__(*args, **kwargs)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmdet/datasets/custom.py", line 97, in __init__
        self.data_infos = self.load_annotations(local_path)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py", line 61, in load_annotations
        data_infos = self.load_video_anns(ann_file)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py", line 73, in load_video_anns
        self.coco = CocoVID(ann_file)
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/parsers/coco_video_parser.py", line 22, in __init__
        super(CocoVID, self).__init__(annotation_file=annotation_file)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmdet/datasets/api_wrappers/coco_api.py", line 23, in __init__
        super().__init__(annotation_file=annotation_file)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/pycocotools/coco.py", line 86, in __init__
        self.createIndex()
      File "/home/music/Downloads/mmtracking-master/mmtrack/datasets/parsers/coco_video_parser.py", line 57, in createIndex
        imgToAnns[ann['image_id']].append(ann)
    KeyError: 'image_id'
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "tools/train.py", line 213, in <module>
        main()
      File "tools/train.py", line 188, in main
        datasets = [build_dataset(cfg.data.train)]
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmdet/datasets/builder.py", line 82, in build_dataset
        dataset = build_from_cfg(cfg, DATASETS, default_args)
      File "/home/music/miniconda3/envs/mmtrack/lib/python3.8/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
        raise type(e)(f'{obj_cls.__name__}: {e}')
    KeyError: "YouTubeVISDataset: 'image_id'"
    

    Thank you!
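    Judging from the traceback, the annotation JSON looks like a plain COCO instance file rather than the CocoVID-style file that the video dataset expects, where every annotation carries an image_id (and normally video_id/instance_id) field. A minimal, hypothetical sanity check (the path is a placeholder):

    # Hedged sketch: check whether the annotation file is in the CocoVID-style format
    # that the video dataset expects (field names taken from the traceback above).
    import json

    ann_file = 'data/youtube_vis_2019/annotations/train.json'  # hypothetical path

    with open(ann_file) as f:
        ann = json.load(f)

    print('top-level keys:', sorted(ann.keys()))  # a CocoVID file also has a 'videos' list
    missing = [i for i, a in enumerate(ann.get('annotations', [])) if 'image_id' not in a]
    print(f'{len(missing)} annotations without an image_id field')
    # If annotations lack per-frame image_id entries, the raw YouTube-VIS json has
    # probably not been converted to the CocoVID format yet.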

    opened by eatbreakfast111 2
  • IndexError: boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 9

    IndexError: boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 9

    Thanks for your error report and we appreciate it a lot.

    Checklist

    1. I have searched related issues but cannot get the expected help.
    2. The bug has not been fixed in the latest version.

    Describe the bug: I was trying to run the qdtrack model on MOT17 in dev-1.x, but I always get this error.

    Reproduction

    1. What command or script did you run?
    srun -p bigdata_s2 --quotatype=auto --gres=gpu:1 python tools/train.py configs/mot/qdtrack/qdtrack_faster-rcnn_r50_fpn_8xb2-4e_mot17halftrain_test-mot17halfval.py
    
    2. Did you make any modifications on the code or config? Did you understand what you have modified? No
    3. What dataset did you use and what task did you run? MOT17

    Environment

    1. Please run python mmtrack/utils/collect_env.py to collect necessary environment information and paste it here. Got error:
    Traceback (most recent call last):
      File "mmtrack/utils/collect_env.py", line 2, in <module>
        from mmcv.utils import collect_env as collect_base_env
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/__init__.py", line 3, in <module>
        from .arraymisc import *
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/arraymisc/__init__.py", line 2, in <module>
        from .quantization import dequantize, quantize
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/arraymisc/quantization.py", line 2, in <module>
        from typing import Union
      File "/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/utils/typing.py", line 3, in <module>
        from typing import Dict, List, Optional, Tuple, Union
    ImportError: cannot import name 'Dict' from partially initialized module 'typing' (most likely due to a circular import) (/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/utils/typing.py)
    

    But I can successfully run the SOT model. My Python version is 3.8 and PyTorch is 1.7.1.
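    The ImportError looks like the repository's own mmtrack/utils/typing.py shadowing the standard-library typing module: running the file directly puts mmtrack/utils first on sys.path. A hypothetical workaround sketch, assuming collect_env is importable from the package:

    # Hypothetical workaround: import collect_env through the package from the repo root
    # instead of executing mmtrack/utils/collect_env.py directly, so the script's own
    # directory (which contains typing.py) is not prepended to sys.path.
    from mmtrack.utils import collect_env  # assumed export; otherwise import the module path

    for name, value in collect_env().items():
        print(f'{name}: {value}')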

    2. You may add additional information that may be helpful for locating the problem, such as
      • How you installed PyTorch [e.g., pip, conda, source]
      • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

    Error traceback: If applicable, paste the error traceback here.

    Traceback (most recent call last):
      File "tools/train.py", line 119, in <module>
        main()
      File "tools/train.py", line 115, in main
        runner.train()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/runner.py", line 1684, in train
        model = self.train_loop.run()  # type: ignore
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 90, in run
        self.run_epoch()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/runner/loops.py", line 105, in run_epoch
        for idx, data_batch in enumerate(self.dataloader):
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
        data = self._next_data()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
        return self._process_data(data)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
        data.reraise()
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/_utils.py", line 428, in reraise
        raise self.exc_type(msg)
    IndexError: Caught IndexError in DataLoader worker process 0.
    Original Traceback (most recent call last):
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
        data = fetcher.fetch(index)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 378, in __getitem__
        data = self.prepare_data(idx)
      File "/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/datasets/base_video_dataset.py", line 387, in prepare_data
        return self.pipeline(final_data_info)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmengine/dataset/base_dataset.py", line 55, in __call__
        data = t(data)
      File "/mnt/lustre/ouyanglinke/anaconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/transforms/base.py", line 12, in __call__
        return self.transform(results)
      File "/mnt/petrelfs/ouyanglinke/mmtracking/mmtrack/datasets/transforms/formatting.py", line 237, in transform
        key_anns[key_valid_idx])
    IndexError: boolean index did not match indexed array along dimension 0; dimension is 10 but corresponding boolean dimension is 9
    

    Bug fix: If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated! This might help: in mmtracking/configs/_base_/datasets/mot_challenge.py, simply commenting out the "TransformBroadcaster" block that wraps mmdet.RandomCrop makes training work, as in the pipeline below.

    # data pipeline
    train_pipeline = [
        dict(
            type='TransformBroadcaster',
            share_random_params=True,
            transforms=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadTrackAnnotations', with_instance_id=True),
                dict(
                    type='mmdet.RandomResize',
                    scale=(1088, 1088),
                    ratio_range=(0.8, 1.2),
                    keep_ratio=True,
                    clip_object_border=False),
                dict(type='mmdet.PhotoMetricDistortion')
            ]),
        # dict(
        #     type='TransformBroadcaster',
        #     share_random_params=False,
        #     transforms=[
        #         dict(
        #             type='mmdet.RandomCrop',
        #             crop_size=(1088, 1088),
        #             bbox_clip_border=False)
        #     ]),
        dict(
            type='TransformBroadcaster',
            share_random_params=True,
            transforms=[
                dict(type='mmdet.RandomFlip', prob=0.5),
            ]),
        dict(type='PackTrackInputs', ref_prefix='ref', num_key_frames=1)
    ]
    
    opened by ouyanglinke 1
  • TypeError: forward_train() missing 4 required positional arguments: 'ref_img', 'ref_img_metas', 'ref_gt_bboxes', and 'ref_gt_labels'

    TypeError: forward_train() missing 4 required positional arguments: 'ref_img', 'ref_img_metas', 'ref_gt_bboxes', and 'ref_gt_labels'

    Hello, I want to train masktrack_rcnn on a COCO-style dataset, so I changed the dataset setting of masktrack_rcnn_r50_fpn_12e_youtubevis2019.py to '../../_base_/datasets/coco_instance.py' and set num_classes=6.

    By the way, I also changed CLASSES in /home/music/Downloads/mmtracking-master/mmtrack/datasets/coco_video_dataset.py to ('aircraft', 'buildings', 'electrical', 'person', 'tree', 'wire') and set load_as_video=False.
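    The TypeError in the title usually means the training data does not provide the reference-frame inputs (ref_img, ref_img_metas, ref_gt_bboxes, ref_gt_labels) that MaskTrackRCNN's forward_train expects; a plain CocoDataset pipeline ending in Collect never packs them. Below is a rough, hedged sketch of the video-style train dict those keys normally come from. Transform names such as SeqLoadAnnotations, SeqPad, SeqDefaultFormatBundle and the ref_img_sampler field are assumptions based on other MMTracking 0.x configs, so the real pipeline should be copied from the original youtubevis config rather than from this sketch.

    # Rough sketch only -- names assumed from MMTracking 0.x video configs; copy the
    # real pipeline from the original masktrack_rcnn youtubevis config.
    data = dict(
        train=dict(
            type='CocoVideoDataset',          # video-aware dataset, not plain CocoDataset
            classes=('aircraft', 'buildings', 'electrical', 'person', 'tree', 'wire'),
            ann_file='data/my_vis/annotations/train.json',   # hypothetical CocoVID-style json
            img_prefix='data/my_vis/train/',
            ref_img_sampler=dict(num_ref_imgs=1, frame_range=2, method='uniform'),
            pipeline=[
                dict(type='LoadMultiImagesFromFile'),
                dict(type='SeqLoadAnnotations', with_bbox=True, with_mask=True, with_track=True),
                dict(type='SeqResize', img_scale=(1333, 800), keep_ratio=True),
                dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
                dict(type='SeqNormalize',
                     mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True),
                dict(type='SeqPad', size_divisor=32),
                dict(type='VideoCollect',
                     keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks', 'gt_instance_ids']),
                dict(type='SeqDefaultFormatBundle', ref_prefix='ref'),
            ]))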

    And my env:

    ------------------------------------------------------------
    sys.platform: linux
    Python: 3.8.15 (default, Nov 24 2022, 15:19:38) [GCC 11.2.0]
    CUDA available: True
    GPU 0: NVIDIA GeForce RTX 3090 Ti
    CUDA_HOME: None
    GCC: gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
    PyTorch: 1.12.1
    PyTorch compiling details: PyTorch built with:
      - GCC 9.3
      - C++ Version: 201402
      - Intel(R) oneAPI Math Kernel Library Version 2021.4-Product Build 20210904 for Intel(R) 64 architecture applications
      - Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
      - OpenMP 201511 (a.k.a. OpenMP 4.5)
      - LAPACK is enabled (usually provided by MKL)
      - NNPACK is enabled
      - CPU capability usage: AVX2
      - CUDA Runtime 11.3
      - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
      - CuDNN 8.3.2  (built against CUDA 11.5)
      - Magma 2.5.2
      - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=/opt/rh/devtoolset-9/root/usr/bin/c++, CXX_FLAGS= -fabi-version=11 -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_ROCM=OFF, 
    
    TorchVision: 0.13.1
    OpenCV: 4.6.0
    MMCV: 1.7.0
    MMCV Compiler: GCC 9.3
    MMCV CUDA Compiler: 11.3
    MMTracking: 0.14.0+
    

    This is my config:

    2022-12-16 10:09:31,585 - mmtrack - INFO - Distributed training: False
    2022-12-16 10:09:32,054 - mmtrack - INFO - Config:
    model = dict(
        detector=dict(
            type='MaskRCNN',
            backbone=dict(
                type='ResNet',
                depth=50,
                num_stages=4,
                out_indices=(0, 1, 2, 3),
                frozen_stages=1,
                norm_cfg=dict(type='BN', requires_grad=True),
                norm_eval=True,
                style='pytorch',
                init_cfg=dict(
                    type='Pretrained', checkpoint='torchvision://resnet50')),
            neck=dict(
                type='FPN',
                in_channels=[256, 512, 1024, 2048],
                out_channels=256,
                num_outs=5),
            rpn_head=dict(
                type='RPNHead',
                in_channels=256,
                feat_channels=256,
                anchor_generator=dict(
                    type='AnchorGenerator',
                    scales=[8],
                    ratios=[0.5, 1.0, 2.0],
                    strides=[4, 8, 16, 32, 64]),
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0.0, 0.0, 0.0, 0.0],
                    target_stds=[1.0, 1.0, 1.0, 1.0]),
                loss_cls=dict(
                    type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
                loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
            roi_head=dict(
                type='StandardRoIHead',
                bbox_roi_extractor=dict(
                    type='SingleRoIExtractor',
                    roi_layer=dict(
                        type='RoIAlign', output_size=7, sampling_ratio=0),
                    out_channels=256,
                    featmap_strides=[4, 8, 16, 32]),
                bbox_head=dict(
                    type='Shared2FCBBoxHead',
                    in_channels=256,
                    fc_out_channels=1024,
                    roi_feat_size=7,
                    num_classes=6,
                    bbox_coder=dict(
                        type='DeltaXYWHBBoxCoder',
                        target_means=[0.0, 0.0, 0.0, 0.0],
                        target_stds=[0.1, 0.1, 0.2, 0.2]),
                    reg_class_agnostic=False,
                    loss_cls=dict(
                        type='CrossEntropyLoss',
                        use_sigmoid=False,
                        loss_weight=1.0),
                    loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
                mask_roi_extractor=dict(
                    type='SingleRoIExtractor',
                    roi_layer=dict(
                        type='RoIAlign', output_size=14, sampling_ratio=0),
                    out_channels=256,
                    featmap_strides=[4, 8, 16, 32]),
                mask_head=dict(
                    type='FCNMaskHead',
                    num_convs=4,
                    in_channels=256,
                    conv_out_channels=256,
                    num_classes=6,
                    loss_mask=dict(
                        type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))),
            train_cfg=dict(
                rpn=dict(
                    assigner=dict(
                        type='MaxIoUAssigner',
                        pos_iou_thr=0.7,
                        neg_iou_thr=0.3,
                        min_pos_iou=0.3,
                        match_low_quality=True,
                        ignore_iof_thr=-1),
                    sampler=dict(
                        type='RandomSampler',
                        num=64,
                        pos_fraction=0.5,
                        neg_pos_ub=-1,
                        add_gt_as_proposals=False),
                    allowed_border=-1,
                    pos_weight=-1,
                    debug=False),
                rpn_proposal=dict(
                    nms_pre=200,
                    max_per_img=200,
                    nms=dict(type='nms', iou_threshold=0.7),
                    min_bbox_size=0),
                rcnn=dict(
                    assigner=dict(
                        type='MaxIoUAssigner',
                        pos_iou_thr=0.5,
                        neg_iou_thr=0.5,
                        min_pos_iou=0.5,
                        match_low_quality=True,
                        ignore_iof_thr=-1),
                    sampler=dict(
                        type='RandomSampler',
                        num=128,
                        pos_fraction=0.25,
                        neg_pos_ub=-1,
                        add_gt_as_proposals=True),
                    mask_size=28,
                    pos_weight=-1,
                    debug=False)),
            test_cfg=dict(
                rpn=dict(
                    nms_pre=200,
                    max_per_img=200,
                    nms=dict(type='nms', iou_threshold=0.7),
                    min_bbox_size=0),
                rcnn=dict(
                    score_thr=0.01,
                    nms=dict(type='nms', iou_threshold=0.5),
                    max_per_img=100,
                    mask_thr_binary=0.5)),
            init_cfg=dict(
                type='Pretrained',
                checkpoint=
                'https://download.openmmlab.com/mmdetection/v2.0/mask_rcnn/mask_rcnn_r50_fpn_1x_coco/mask_rcnn_r50_fpn_1x_coco_20200205-d4b0c5d6.pth'
            )),
        type='MaskTrackRCNN',
        track_head=dict(
            type='RoITrackHead',
            roi_extractor=dict(
                type='SingleRoIExtractor',
                roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
                out_channels=256,
                featmap_strides=[4, 8, 16, 32]),
            embed_head=dict(
                type='RoIEmbedHead',
                num_fcs=2,
                roi_feat_size=7,
                in_channels=256,
                fc_out_channels=1024),
            train_cfg=dict(
                assigner=dict(
                    type='MaxIoUAssigner',
                    pos_iou_thr=0.5,
                    neg_iou_thr=0.5,
                    min_pos_iou=0.5,
                    match_low_quality=True,
                    ignore_iof_thr=-1),
                sampler=dict(
                    type='RandomSampler',
                    num=128,
                    pos_fraction=0.25,
                    neg_pos_ub=-1,
                    add_gt_as_proposals=True),
                pos_weight=-1,
                debug=False)),
        tracker=dict(
            type='MaskTrackRCNNTracker',
            match_weights=dict(det_score=1.0, iou=2.0, det_label=10.0),
            num_frames_retain=20))
    dataset_type = 'CocoDataset'
    data_root = '/home/music/Downloads/mmtracking-master/data/coco/'
    img_norm_cfg = dict(
        mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
    train_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
        dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
        dict(type='RandomFlip', flip_ratio=0.5),
        dict(
            type='Normalize',
            mean=[123.675, 116.28, 103.53],
            std=[58.395, 57.12, 57.375],
            to_rgb=True),
        dict(type='Pad', size_divisor=32),
        dict(type='DefaultFormatBundle'),
        dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
    ]
    test_pipeline = [
        dict(type='LoadImageFromFile'),
        dict(
            type='MultiScaleFlipAug',
            img_scale=(1333, 800),
            flip=False,
            transforms=[
                dict(type='Resize', keep_ratio=True),
                dict(type='RandomFlip'),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='ImageToTensor', keys=['img']),
                dict(type='Collect', keys=['img'])
            ])
    ]
    data = dict(
        samples_per_gpu=6,
        workers_per_gpu=2,
        train=dict(
            type='CocoDataset',
            ann_file=
            '/home/music/Downloads/mmtracking-master/data/coco/annotations/train.json',
            img_prefix=
            '/home/music/Downloads/mmtracking-master/data/coco/train2023/',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
                dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
                dict(type='RandomFlip', flip_ratio=0.5),
                dict(
                    type='Normalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='Pad', size_divisor=32),
                dict(type='DefaultFormatBundle'),
                dict(
                    type='Collect',
                    keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks'])
            ]),
        val=dict(
            type='CocoDataset',
            ann_file=
            '/home/music/Downloads/mmtracking-master/data/coco/annotations/val.json',
            img_prefix='/home/music/Downloads/mmtracking-master/data/coco/val2023/',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1333, 800),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]),
        test=dict(
            type='CocoDataset',
            ann_file=
            '/home/music/Downloads/mmtracking-master/data/coco/annotations/val.json',
            img_prefix='/home/music/Downloads/mmtracking-master/data/coco/val2023/',
            pipeline=[
                dict(type='LoadImageFromFile'),
                dict(
                    type='MultiScaleFlipAug',
                    img_scale=(1333, 800),
                    flip=False,
                    transforms=[
                        dict(type='Resize', keep_ratio=True),
                        dict(type='RandomFlip'),
                        dict(
                            type='Normalize',
                            mean=[123.675, 116.28, 103.53],
                            std=[58.395, 57.12, 57.375],
                            to_rgb=True),
                        dict(type='Pad', size_divisor=32),
                        dict(type='ImageToTensor', keys=['img']),
                        dict(type='Collect', keys=['img'])
                    ])
            ]))
    evaluation = dict(metric=['bbox', 'segm'], classwise=True)
    optimizer = dict(type='SGD', lr=0.00125, momentum=0.9, weight_decay=0.0001)
    optimizer_config = dict(grad_clip=dict(max_norm=35, norm_type=2))
    checkpoint_config = dict(interval=1)
    log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
    dist_params = dict(backend='nccl')
    log_level = 'INFO'
    load_from = None
    resume_from = None
    workflow = [('train', 1)]
    opencv_num_threads = 0
    mp_start_method = 'fork'
    lr_config = dict(
        policy='step',
        warmup='linear',
        warmup_iters=500,
        warmup_ratio=0.3333333333333333,
        step=[8, 11])
    total_epochs = 12
    work_dir = 'work_dir/masktrack_coco'
    gpu_ids = [0]
    

    Best wishes! Thank you!

    opened by lijoe123 7
Releases(v1.0.0rc1)
  • v1.0.0rc1(Oct 11, 2022)

    MMTracking 1.0.0rc1 is the second version of MMTracking 1.x, a part of the OpenMMLab 2.0 projects.

    Built upon the new training engine, MMTracking 1.x unifies the interfaces of datasets, models, evaluation, and visualization.

    And there are some BC-breaking changes. Please check the migration tutorial for more details.

    We also support more methods in MMTracking 1.x, such as StrongSORT for MOT, Mask2Former for VIS, PrDiMP for SOT.

    Source code(tar.gz)
    Source code(zip)
  • v0.14.0(Sep 19, 2022)

    Highlights

    • Introduce the 1.0.0rc0 version of MMTracking (#725)

    New Features

    • Support OC-SORT method for MOT (#545)

    • Support multi-class tracking in ByteTrack (#548)

    • Support DanceTrack dataset for MOT (#543)

    • Support TAO dataset for QDTrack (#585)

    Source code(tar.gz)
    Source code(zip)
  • v1.0.0rc0(Aug 31, 2022)

    We recommend using MMTracking v1.0.0rc1, since v1.0.0rc0 has some bugs related to the minimum required version of mmdet.

    Source code(tar.gz)
    Source code(zip)
  • v0.13.0(Apr 29, 2022)

    Highlights

    • Support tracking colab tutorial (#511)

    New Features

    • Refactor the training datasets of SiamRPN++ (#496), (#518)

    • Support loading data from ceph for SOT datasets (#494)

    • Support loading data from ceph for MOT challenge dataset (#517)

    • Support evaluation metric for VIS task (#501)

    Bug Fixes

    • Fix a bug in the LaSOT datasets and update the pretrained models of STARK (#483), (#503)

    • Fix a bug in the format_results function of VIS task (#504)

    Source code(tar.gz)
    Source code(zip)
  • v0.12.0(Apr 1, 2022)

  • v0.11.0(Mar 4, 2022)

  • v0.10.0(Feb 10, 2022)

  • v0.9.0(Jan 6, 2022)

    Highlights

    • Support arXiv 2021 manuscript 'ByteTrack: Multi-Object Tracking by Associating Every Detection Box' (#385), (#383), (#372)
    • Support ICCV 2019 paper 'Video Instance Segmentation' (#304), (#303), (#298), (#292)

    New Features

    • Support CrowdHuman dataset for MOT (#366)
    • Support VOT2018 dataset for SOT (#305)
    • Support YouTube-VIS dataset for VIS (#290)

    Bug Fixes

    • Fix two significant bugs in SOT and provide new SOT pretrained models (#349)

    Improvements

    • Refactor LaSOT, TrackingNet dataset and support GOT-10K datasets (#296)
    • Support persistent workers (#348)
    Source code(tar.gz)
    Source code(zip)
  • v0.8.0(Oct 3, 2021)

    New Features

    • Support OTB100 dataset in SOT (#271)
    • Support TrackingNet dataset in SOT (#268)
    • Support UAV123 dataset in SOT (#260)

    Bug Fixes

    • Fix a bug in mot_param_search.py (#270)

    Improvements

    • Use PyTorch sphinx theme (#274)
    • Use pycocotools instead of mmpycocotools (#263)
    Source code(tar.gz)
    Source code(zip)
  • v0.7.0(Sep 3, 2021)

    Highlights

    • Release code of AAAI 2021 paper 'Temporal ROI Align for Video Object Recognition' (#247)
    • Refactor English documentations (#243)
    • Add Chinese documentations (#248), (#250)

    New Features

    • Support fp16 training and testing (#230)
    • Release model using ResNeXt-101 as backbone for all VID methods (#254)
    • Support the results of Tracktor on MOT15, MOT16 and MOT20 datasets (#217)
    • Support visualization for single gpu test (#216)

    Bug Fixes

    • Fix a bug in MOTP evaluation (#235)
    • Fix two bugs in reid training and testing (#249)

    Improvements

    • Refactor anchor in SiameseRPN++ (#229)
    • Unify model initialization (#235)
    • Refactor unittest (#231)
    Source code(tar.gz)
    Source code(zip)
  • v0.6.0(Jul 30, 2021)

    Highlights

    • Fix training bugs of all three tasks (#219), (#221)

    New Features

    • Support error visualization for mot task (#212)

    Bug Fixes

    • Fix a bug in SOT demo (#213)

    Improvements

    • Use MMCV registry (#220)
    • Add README.md for reid training (#210)
    • Modify dict keys of the outputs of SOT (#223)
    • Add Chinese docs including install.md, quick_run.md, model_zoo.md, dataset.md (#205), (#214)
    Source code(tar.gz)
    Source code(zip)
  • v0.5.3(Jul 2, 2021)

  • v0.5.2(Jun 3, 2021)

  • v0.5.1(Feb 1, 2021)

  • v0.5.0(Jan 5, 2021)

    Highlights

    • MMTracking is released! It is the first open source toolbox that unifies versatile video perception tasks, including single object tracking, multiple object tracking, and video object detection.

    New Features

    Source code(tar.gz)
    Source code(zip)