A pytorch implementation of faster RCNN detection framework (Use detectron2, it's a masterpiece)

Overview

Notice(2019.11.2)

This repo was built two years ago, when there was no pytorch detection implementation that could achieve reasonable performance. By now there are many better repos out there, for example detectron2 (see the title above).

Therefore, this repo will not be actively maintained.

Important notice:

If you used the master branch before Sep. 26 2017 and its corresponding pretrained model, PLEASE PAY ATTENTION: The old master branch is now under old_master. You can still run the code and download the pretrained model, but the pretrained model for that old master is not compatible with the current master!

The main differences between the new and old master branches are in these two commits: 9d4c24e, c899ce7. The change is related to this issue; master now matches all the details in tf-faster-rcnn so that we can now convert a pretrained tf model to a pytorch model.

pytorch-faster-rcnn

A pytorch implementation of faster RCNN detection framework based on Xinlei Chen's tf-faster-rcnn. Xinlei Chen's repository is based on the python Caffe implementation of faster RCNN available here.

Note: Several minor modifications are made when reimplementing the framework, which give potential improvements. For details about the modifications and ablative analysis, please refer to the technical report An Implementation of Faster RCNN with Study for Region Sampling. If you are seeking to reproduce the results in the original paper, please use the official code or maybe the semi-official code. For details about the faster RCNN architecture please refer to the paper Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks.

Detection Performance

The current code supports VGG16, Resnet V1 and Mobilenet V1 models. We mainly tested it on plain VGG16 and Resnet101 architecture. As the baseline, we report numbers using a single model on a single convolution layer, so no multi-scale, no multi-stage bounding box regression, no skip-connection, no extra input is used. The only data augmentation technique is left-right flipping during training following the original Faster RCNN. All models are released.

With VGG16 (conv5_3):

  • Train on VOC 2007 trainval and test on VOC 2007 test, 71.22(from scratch) 70.75(converted) (70.8 for tf-faster-rcnn).
  • Train on VOC 2007+2012 trainval and test on VOC 2007 test (R-FCN schedule), 75.33(from scratch) 75.27(converted) (75.7 for tf-faster-rcnn).
  • Train on COCO 2014 trainval35k and test on minival (900k/1190k), 29.2(from scratch) 30.1(converted) (30.2 for tf-faster-rcnn).

With Resnet101 (last conv4):

  • Train on VOC 2007 trainval and test on VOC 2007 test, 75.29(from scratch) 75.76(converted) (75.7 for tf-faster-rcnn).
  • Train on VOC 2007+2012 trainval and test on VOC 2007 test (R-FCN schedule), 79.26(from scratch) 79.78(converted) (79.8 for tf-faster-rcnn).
  • Train on COCO 2014 trainval35k and test on minival (800k/1190k), 35.1(from scratch) 35.4(converted) (35.4 for tf-faster-rcnn).

More Results:

  • Train Mobilenet (1.0, 224) on COCO 2014 trainval35k and test on minival (900k/1190k), 21.4(from scratch), 21.9(converted) (21.8 for tf-faster-rcnn).
  • Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), 32.4(converted) (32.4 for tf-faster-rcnn).
  • Train Resnet152 on COCO 2014 trainval35k and test on minival (900k/1190k), 36.7(converted) (36.1 for tf-faster-rcnn).

Approximate baseline setup from FPN (this repository does not contain training code for FPN yet):

  • Train Resnet50 on COCO 2014 trainval35k and test on minival (900k/1190k), 34.2.
  • Train Resnet101 on COCO 2014 trainval35k and test on minival (900k/1190k), 37.4.
  • Train Resnet152 on COCO 2014 trainval35k and test on minival (900k/1190k), 38.2.

Note:

  • Due to the randomness in GPU training especially for VOC, the best numbers are reported (with 2-3 attempts) here. According to Xinlei's experience, for COCO you can almost always get a very close number (within ~0.2%) despite the randomness.

  • The numbers are obtained with the default testing scheme, which selects region proposals using non-maximal suppression (TEST.MODE nms). The alternative testing scheme (TEST.MODE top) will likely result in slightly better performance (see report, for COCO it boosts 0.X AP).

  • Since we keep the small proposals (< 16 pixels width/height), our performance is especially good for small objects.

  • We do not set a threshold (instead of 0.05) for a detection to be included in the final result, which increases recall.

  • Weight decay is set to 1e-4.

  • For other minor modifications, please check the report. Notable ones include using crop_and_resize, and excluding ground truth boxes in RoIs during training.

  • For COCO, we find the performance improving with more iterations, and potentially better performance can be achieved with even more iterations.

  • For Resnets, we fix the first block (total 4) when fine-tuning the network, and only use crop_and_resize to resize the RoIs (7x7) without max-pool (which Xinlei finds useless especially for COCO). The final feature maps are average-pooled for classification and regression. All batch normalization parameters are fixed. Learning rate for biases is not doubled.

  • For Mobilenets, we fix the first five layers when fine-tuning the network. All batch normalization parameters are fixed. Weight decay for Mobilenet layers is set to 4e-5.

  • For approximate FPN baseline setup we simply resize the image with 800 pixels, add 32^2 anchors, and take 1000 proposals during testing.

  • Check out here/here/here for the latest models, including longer COCO VGG16 models and Resnet ones.
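
The difference between the two testing schemes mentioned in the notes above can be sketched in plain Python. This is only an illustration, not the repository's code: the function names, the default 0.7 IoU threshold and the 300-proposal cap are assumptions made for the example, and the actual implementation relies on torchvision.ops.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_nms(boxes: Sequence[Box], scores: Sequence[float],
               iou_thresh: float = 0.7, max_keep: int = 300) -> List[int]:
    """'nms' style: greedily keep the best box and drop boxes that overlap it."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
        if len(keep) == max_keep:
            break
    return keep


def select_top(scores: Sequence[float], max_keep: int = 300) -> List[int]:
    """'top' style: keep the highest-scoring boxes without any suppression."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:max_keep]
```

Given three proposals where the two best-scoring ones overlap heavily, select_nms with a low enough threshold keeps the first and the third, while select_top simply keeps the first two.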

Figure: Displayed Ground Truth on Tensorboard
Figure: Displayed Predictions on Tensorboard

Additional features

Additional features not mentioned in the report are added to make research life easier:

  • Support for train-and-validation. During training, the validation data will also be tested from time to time to monitor the process and check for potential overfitting. Ideally training and validation should be separate, where the model is loaded every time to test on validation. However, Xinlei has implemented it in a joint way to save time and GPU memory. Though in the default setup the testing data is used for validation, no special attempt is made to overfit on the testing set.
  • Support for resuming training. Xinlei tried to store as much information as possible when snapshotting, with the purpose of resuming training from the latest snapshot properly. The meta information includes the current image index, the permutation of images, and the random state of numpy. However, when you resume training the random seed for tensorflow will be reset (not sure how to save the random state of tensorflow now), so it will result in a difference. Note that the current implementation still cannot force the model to behave deterministically even with the random seeds set. Suggestions and solutions are welcome and much appreciated.
  • Support for visualization. The current implementation will summarize ground truth boxes, statistics of losses, activations and variables during training, and dump it to a separate folder for tensorboard visualization. The computing graph is also saved for debugging.
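
As a rough illustration of the resume support described above, the snapshot meta information could be stored and restored along the following lines. This is a sketch, not the repository's code: save_meta and load_meta are hypothetical helpers, and Python's built-in random module stands in for numpy's random state so that the example has no dependencies.

```python
import pickle
import random


def save_meta(path, cur_index, perm):
    """Store what is needed to continue iterating over the data."""
    meta = {
        "cur": cur_index,                # index of the next image to load
        "perm": list(perm),              # current shuffled order of the images
        "rng_state": random.getstate(),  # stand-in for numpy's random state
    }
    with open(path, "wb") as f:
        pickle.dump(meta, f, pickle.HIGHEST_PROTOCOL)


def load_meta(path):
    """Restore the random state and return (cur_index, perm)."""
    with open(path, "rb") as f:
        meta = pickle.load(f)
    random.setstate(meta["rng_state"])
    return meta["cur"], meta["perm"]
```

After load_meta, the data order and the random stream continue from where the snapshot was taken. Any random state that is not captured this way is the kind of gap the resume note above describes.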

Prerequisites

  • A basic pytorch installation. The code follows 1.0. If you are using the older 0.1.12, 0.2, 0.3 or 0.4, you can check out the corresponding branch.
  • Torchvision 0.3. This code uses torchvision.ops for nms, roi_pool and roi_align.
  • Python packages you might not have: opencv-python, easydict (similar to py-faster-rcnn). For easydict make sure you have the right version. Xinlei uses 1.6.
  • tensorboard-pytorch to visualize the training and validation curve. Please build from source to use the latest tensorflow-tensorboard.
  • Docker users: Since the recent upgrade, the docker image on docker hub (https://hub.docker.com/r/mbuckler/tf-faster-rcnn-deps/) is no longer valid. However, you can still build your own image using the dockerfile located in the docker folder (cuda 8 version, as it is required by Tensorflow r1.0). Also make sure to follow the Tensorflow installation instructions to install and use nvidia-docker (https://github.com/NVIDIA/nvidia-docker). Lastly, after launching the container, you have to build the Cython modules within the running container.
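
Before installing anything, it can help to check which of the packages above are already importable. The sketch below uses only the standard library. The import names (cv2 for opencv-python, tensorboardX for tensorboard-pytorch) are assumptions about how these packages are exposed, and missing_packages is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Package name -> top-level import name (the import names are assumptions).
REQUIRED = {
    "pytorch": "torch",
    "torchvision": "torchvision",
    "opencv-python": "cv2",
    "easydict": "easydict",
    "tensorboard-pytorch": "tensorboardX",
}


def missing_packages(required=REQUIRED):
    """Return the package names whose module cannot be found."""
    return sorted(
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    print("Missing:", missing_packages() or "nothing")
```

This only checks that the modules can be found. It does not check versions, so the pytorch and torchvision versions listed above still need to be verified separately.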

Installation

  1. Clone the repository
git clone https://github.com/ruotianluo/pytorch-faster-rcnn.git
  2. Install the Python COCO API. The code requires the API to access the COCO dataset.
cd data
git clone https://github.com/pdollar/coco.git
cd coco/PythonAPI
make
cd ../../..

Setup data

Please follow the instructions of py-faster-rcnn here to set up the VOC and COCO datasets (Part of COCO is done). The steps involve downloading data and optionally creating soft links in the data folder. Since faster RCNN does not rely on pre-computed proposals, it is safe to ignore the steps that set up proposals.

If you find it useful, the data/cache folder created on Xinlei's side is also shared here.

Demo and Test with pre-trained models

  1. Download pre-trained model (only google drive works)
  • Another server here.
  • Google drive here.

(Optional) Instead of downloading my pretrained or converted model, you can also convert from tf-faster-rcnn model. You can download the tensorflow pretrained model from tf-faster-rcnn. Then run:

python tools/convert_from_tensorflow.py --tensorflow_model resnet_model.ckpt 
python tools/convert_from_tensorflow_vgg.py --tensorflow_model vgg_model.ckpt

This script will create a .pth file with the same name in the same folder as the tensorflow model.

  2. Create a folder and a soft link to use the pre-trained model
NET=res101
TRAIN_IMDB=voc_2007_trainval+voc_2012_trainval
mkdir -p output/${NET}/${TRAIN_IMDB}
cd output/${NET}/${TRAIN_IMDB}
ln -s ../../../data/voc_2007_trainval+voc_2012_trainval ./default
cd ../../..
  3. Demo for testing on custom images
# at repository root
GPU_ID=0
CUDA_VISIBLE_DEVICES=${GPU_ID} ./tools/demo.py

Note: Resnet101 testing probably requires several gigabytes of memory, so if you encounter memory capacity issues, please install it with CPU support only. Refer to Issue 25.

  4. Test with pre-trained Resnet101 models
GPU_ID=0
./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc_0712 res101

Note: If you cannot get the reported numbers (79.8 on my side), then probably the NMS function is compiled improperly, refer to Issue 5.

Train your own model

  1. Download pre-trained models and weights. The current code supports VGG16 and Resnet V1 models. Pre-trained models are provided by pytorch-vgg and pytorch-resnet (the ones with caffe in the name); you can download the pre-trained models and place them in the data/imagenet_weights folder. For example, for the VGG16 model, you can set it up like:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    python # open python in terminal and run the following Python code
    import torch
    from torch.utils.model_zoo import load_url
    from torchvision import models
    
    # download the pre-trained VGG16 weights as a state dict
    sd = load_url("https://s3-us-west-2.amazonaws.com/jcjohns-models/vgg16-00b39a1b.pth")
    # move the parameters stored under classifier.1 to classifier.0
    sd['classifier.0.weight'] = sd['classifier.1.weight']
    sd['classifier.0.bias'] = sd['classifier.1.bias']
    del sd['classifier.1.weight']
    del sd['classifier.1.bias']
    
    # likewise move the parameters stored under classifier.4 to classifier.3
    sd['classifier.3.weight'] = sd['classifier.4.weight']
    sd['classifier.3.bias'] = sd['classifier.4.bias']
    del sd['classifier.4.weight']
    del sd['classifier.4.bias']
    
    torch.save(sd, "vgg16.pth")
    exit() # leave the Python prompt before running the next shell command
    cd ../..

    For Resnet101, you can set up like:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    # download from my gdrive (link in pytorch-resnet)
    mv resnet101-caffe.pth res101.pth
    cd ../..

    For Mobilenet V1, you can set up like:

    mkdir -p data/imagenet_weights
    cd data/imagenet_weights
    # download from my gdrive (https://drive.google.com/open?id=0B7fNdx_jAqhtZGJvZlpVeDhUN1k)
    mv mobilenet_v1_1.0_224.pth.pth mobile.pth
    cd ../..
  2. Train (and test, evaluation)

./experiments/scripts/train_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to train on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {pascal_voc, pascal_voc_0712, coco} is defined in train_faster_rcnn.sh
# Examples:
./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/train_faster_rcnn.sh 1 coco res101

Note: Please double check that you have deleted the soft link to the pre-trained models before training. If you find NaNs during training, please refer to Issue 86. If you want multi-gpu support, check out Issue 121.

  3. Visualization with Tensorboard
tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &
tensorboard --logdir=tensorboard/vgg16/coco_2014_train+coco_2014_valminusminival/ --port=7002 &
  4. Test and evaluate
./experiments/scripts/test_faster_rcnn.sh [GPU_ID] [DATASET] [NET]
# GPU_ID is the GPU you want to test on
# NET in {vgg16, res50, res101, res152} is the network arch to use
# DATASET {pascal_voc, pascal_voc_0712, coco} is defined in test_faster_rcnn.sh
# Examples:
./experiments/scripts/test_faster_rcnn.sh 0 pascal_voc vgg16
./experiments/scripts/test_faster_rcnn.sh 1 coco res101
  5. You can use tools/reval.sh for re-evaluation
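
When the train and test scripts are launched from Python, for example in a sweep over networks, it may be convenient to validate the arguments first. The sketch below only builds the argument list. build_command is a hypothetical helper, and the allowed values are copied from the comments in the commands above.

```python
# Allowed values, taken from the script usage comments above.
NETS = {"vgg16", "res50", "res101", "res152"}
DATASETS = {"pascal_voc", "pascal_voc_0712", "coco"}


def build_command(mode, gpu_id, dataset, net):
    """Return the argv list for the train or test script."""
    if mode not in ("train", "test"):
        raise ValueError(f"mode must be 'train' or 'test', got {mode!r}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}")
    if net not in NETS:
        raise ValueError(f"unknown network {net!r}")
    script = f"./experiments/scripts/{mode}_faster_rcnn.sh"
    return [script, str(gpu_id), dataset, net]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) from the repository root to actually launch it.
    print(" ".join(build_command("train", 0, "pascal_voc", "vgg16")))
```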

By default, trained networks are saved under:

output/[NET]/[DATASET]/default/

Test outputs are saved under:

output/[NET]/[DATASET]/default/[SNAPSHOT]/

Tensorboard information for train and validation is saved under:

tensorboard/[NET]/[DATASET]/default/
tensorboard/[NET]/[DATASET]/default_val/
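
The directory layout above can be written as a small helper. This is a sketch under assumptions: experiment_dirs is a hypothetical function, and the mapping from the script's DATASET argument to the training imdb name is inferred from the examples in this README (for instance voc_2007_trainval+voc_2012_trainval for pascal_voc_0712), not read from the scripts themselves.

```python
# DATASET argument -> training imdb name, inferred from the README examples.
TRAIN_IMDB = {
    "pascal_voc": "voc_2007_trainval",
    "pascal_voc_0712": "voc_2007_trainval+voc_2012_trainval",
    "coco": "coco_2014_train+coco_2014_valminusminival",
}


def experiment_dirs(net, dataset, tag="default"):
    """Return the snapshot and tensorboard folders for one experiment."""
    imdb = TRAIN_IMDB[dataset]
    return {
        "snapshots": f"output/{net}/{imdb}/{tag}/",
        "tensorboard_train": f"tensorboard/{net}/{imdb}/{tag}/",
        "tensorboard_val": f"tensorboard/{net}/{imdb}/{tag}_val/",
    }


if __name__ == "__main__":
    for name, path in experiment_dirs("res101", "pascal_voc_0712").items():
        print(f"{name}: {path}")
```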

The default number of training iterations is kept the same as in the original faster RCNN for VOC 2007; however, Xinlei finds it beneficial to train longer (see report for COCO), probably because the image batch size is one. For VOC 07+12 we switch to an 80k/110k schedule following R-FCN. Also note that due to the nondeterministic nature of the current implementation, the performance can vary a bit, but in general it should be within ~1% of the reported numbers for VOC, and ~0.2% of the reported numbers for COCO. Suggestions/Contributions are welcome.
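
One way to read a schedule such as 80k/110k is as a single learning-rate step at 80k iterations within a 110k-iteration run. The sketch below illustrates that reading only. The base learning rate of 0.001 matches the lr value visible in the training logs quoted on this page, while the decay factor of 0.1 and the function itself are assumptions.

```python
def learning_rate(iteration, base_lr=0.001, step=80000, gamma=0.1):
    """Step schedule: base_lr before `step`, base_lr * gamma from `step` on."""
    return base_lr * gamma if iteration >= step else base_lr


if __name__ == "__main__":
    max_iters = 110000  # the second number in "80k/110k"
    for it in (0, 79999, 80000, max_iters - 1):
        print(it, learning_rate(it))
```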

Citation

If you find this implementation or the analysis conducted in our report helpful, please consider citing:

@article{chen17implementation,
    Author = {Xinlei Chen and Abhinav Gupta},
    Title = {An Implementation of Faster RCNN with Study for Region Sampling},
    Journal = {arXiv preprint arXiv:1702.02138},
    Year = {2017}
}

For convenience, here is the faster RCNN citation:

@inproceedings{renNIPS15fasterrcnn,
    Author = {Shaoqing Ren and Kaiming He and Ross Girshick and Jian Sun},
    Title = {Faster {R-CNN}: Towards Real-Time Object Detection
             with Region Proposal Networks},
    Booktitle = {Advances in Neural Information Processing Systems ({NIPS})},
    Year = {2015}
}

Detailed numbers from COCO server (not supported)

All the models are trained on COCO 2014 trainval35k.

VGG16 COCO 2015 test-dev (900k/1190k):

 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.297
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.504
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.312
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.128
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.325
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.421
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.272
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.399
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.409
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.187
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.451
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.591

VGG16 COCO 2015 test-std (900k/1190k):

Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.295
Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.501
Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.312
Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.119
Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.327
Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.418
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.273
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.400
Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.409
Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.179
Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.455
Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.586
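
To compare runs, the summary lines above can be turned into a dictionary. The following is a minimal sketch that assumes the exact line layout shown in this section. parse_summary is a hypothetical helper, not part of the repository or of the COCO API.

```python
import re

# Matches one summary line, for example:
# "Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.297"
SUMMARY_RE = re.compile(
    r"Average (?:Precision|Recall)\s+\((AP|AR)\) @\[ "
    r"IoU=(\S+)\s+\| area=\s*(\w+) \| maxDets=\s*(\d+) \] = (-?\d+\.\d+)"
)


def parse_summary(text):
    """Map (metric, IoU range, area, maxDets) to the reported value."""
    results = {}
    for metric, iou, area, max_dets, value in SUMMARY_RE.findall(text):
        results[(metric, iou, area, int(max_dets))] = float(value)
    return results
```

For example, parse_summary(text)[("AP", "0.50:0.95", "all", 100)] gives the headline AP of a run.
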
Comments
  • Poor results

    Hi, I am using a TITAN X with Ubuntu 16.04, CUDA 8 and cuDNN 5, and I trained the model as described in the README. However, I cannot get the results mentioned in the README. I have two problems.

    1. At the beginning of training, the loss sometimes becomes nan. It seems something is wrong in the initialization, but I cannot find the bug in the code.
    iter: 20 / 110000, total loss: nan
     >>> rpn_loss_cls: 0.691554
     >>> rpn_loss_box: 0.019584
     >>> loss_cls: nan
     >>> loss_box: nan
     >>> lr: 0.001000
    speed: 0.382s / iter
    
    2. When the losses are not nan, I get poor results: Mean AP = 0.6542 or Mean AP = 0.5809, training on VOC 2007+2012 trainval and testing on VOC 2007 with the default config.

    Can you help me?

    opened by philokey 18
  • tensorflow-converted model failed in demo.py with 'KeyError'

    I first converted the tf-faster-rcnn models http://xinlei.sp.cs.cmu.edu/xinleic/tf-faster-rcnn/res101/voc_2007_50-70k.tgz and http://xinlei.sp.cs.cmu.edu/xinleic/tf-faster-rcnn/vgg16/voc_2007_50k-70k.tgz, then ran tools/demo.py, but it failed with the errors below (ubuntu14.04, cuda8.0, cudnn6, python3.6).

    res101+pascal_voc

    Traceback (most recent call last):
      File "tools/demo.py", line 143, in <module>
        net.load_state_dict(torch.load(saved_model))
      File "/home/ljy/pytorch-examples-master/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 481, in load_state_dict
        nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
      File "/home/ljy/pytorch-examples-master/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 481, in <dictcomp>
        nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
    KeyError: 'resnet.bn1.bias'
    

    vgg16+pascal_voc

    Traceback (most recent call last):
      File "tools/demo.py", line 143, in <module>
        net.load_state_dict(torch.load(saved_model))
      File "/home/ljy/pytorch-examples-master/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 481, in load_state_dict
        nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
      File "/home/ljy/pytorch-examples-master/pytorch-faster-rcnn-master/tools/../lib/nets/network.py", line 481, in <dictcomp>
        nn.Module.load_state_dict(self, {k: state_dict[k] for k in list(self.state_dict())})
    KeyError: 'vgg.features.14.bias'
    

    Is something wrong with the tensorflow-converted model?

    opened by JingyunLiang 13
  • error in proposal target layer

    iter: 820 / 490000, total loss: 1.824865
    rpn_loss_cls: 0.154106
    rpn_loss_box: 0.004902
    loss_cls: 1.164764
    loss_box: 0.501092
    lr: 0.001000
    speed: 0.779s / iter
    /scratch0/pytorch-faster-rcnn/lib/layer_utils/proposal_target_layer.py(142)_sample_rois()
    -> keep_inds = torch.cat([fg_inds, bg_inds], 0)
    (Pdb) fg_inds
    [torch.cuda.LongTensor with no dimension]
    (Pdb) bg_inds
    [torch.cuda.LongTensor with no dimension]

    So there's a breakpoint which was already there. I am training on pascal and coco. After running for a few iterations it hits this breakpoint, which seems to be due to 0 fgs and 0 bgs. This shouldn't be the case. Am I missing something here? Should I lower the thresholds for fg/bg?

    opened by akumar14 11
  • Why both "fg_inds.numel()=0 and bg_inds.numel()=0" not handled in proposal_target_layer.py?

    In proposal_target_layer.py,

      # Small modification to the original version where we ensure a fixed number of regions are sampled
      if fg_inds.numel() > 0 and bg_inds.numel() > 0:
        ...
      elif fg_inds.numel() > 0:
        ...
      elif bg_inds.numel() > 0:
        ...
      else:
        import pdb
        pdb.set_trace()
    
    

    The case when both fg_inds.numel()=0 and bg_inds.numel()=0 is not implemented.

    When GT roi is not used in training rcnn (set via the cfg TRAIN.USE_GT flag), fg_inds.numel() can be zero. Note, this is not the case for original faster rcnn where fg_inds.numel() != 0.

    Also, the cfg TRAIN.BG_THRESH_HI and TRAIN.BG_THRESH_LO values can also cause bg_inds.numel()=0.

    Since both "fg_inds.numel()=0 and bg_inds.numel()=0" is a possible case, I wonder why it is not handled. In this case, why not just randomly select RoIs and set them as negative?

    opened by hengck23 11
  • Memory usage when train a model

    Hi, do you know how to calculate the maximum GPU memory usage when training a model, so that I can change the SCALE size of the input image? I found that when I trained the model with the original tensorflow version, the SCALE of the input image could be larger than with this pytorch version. (When training with a big SCALE (about 800 * 1000), the pytorch version runs out of memory.)

    Thanks very much!

    opened by sshaoshuai 10
  • a solution to KeyError:'resnet.bn1.num_batches_tracked'

    Recently I used this code to train on my own data with the command ./train_faster_rcnn.sh 0 pascal_voc res101 (note: I use the pretrained Resnet101 weights for my new training), but a problem occurs: KeyError: 'resnet.bn1.num_batches_tracked'. I looked deeper and found that the problem occurs when loading the pretrained model, at the line 'self.net.load_pretrained_cnn(torch.load(self.pretrained_model))' in lib/model/train_val.py. Before entering this line, self.net is the faster_rcnn structure, but when I step into the 'self.net.load_pretrained_cnn' function, the structure becomes the plain resnet101 network, and in that situation the code throws the KeyError. So I simply moved 'load_pretrained_cnn' from resnet to train_val.py. More details are shown in the attached images.

    After doing that, my test runs as normal.

    opened by Owen-Fish 9
  • Zero RPN bias

    Hi @ruotianluo, I trained a new model on the PASCAL_VOC dataset with Vgg16 (CPU only with roi_pooling, Python 3.6.3) downloaded from https://github.com/jcjohnson/pytorch-vgg. However, I found that the biases of rpn_net, rpn_cls_score_net, rpn_bbox_pred_net, cls_score_net and bbox_pred_net are always zero. Is this an error? If yes, which part should be modified or changed?

    opened by bk-00 9
  • key error when train by myself with "./experiments/scripts/train_faster_rcnn.sh 0 pascal_voc res101"

    When I train the model myself, it raises a KeyError when loading the res101 parameters from the pre-trained model. How can I solve this?

    Loading initial model weights from data/imagenet_weights/res101.pth
    Traceback (most recent call last):
      File "./tools/trainval_net.py", line 138, in
        max_iters=args.max_iters)
      File "/home/hanbing/ExtraDisk/GithubProject/pytorch-faster-rcnn/tools/../lib/model/train_val.py", line 348, in train_net
        sw.train_model(max_iters)
      File "/home/hanbing/ExtraDisk/GithubProject/pytorch-faster-rcnn/tools/../lib/model/train_val.py", line 224, in train_model
        lr, last_snapshot_iter, stepsizes, np_paths, ss_paths = self.initialize()
      File "/home/hanbing/ExtraDisk/GithubProject/pytorch-faster-rcnn/tools/../lib/model/train_val.py", line 167, in initialize
        self.net.load_pretrained_cnn(torch.load(self.pretrained_model))
      File "/home/hanbing/ExtraDisk/GithubProject/pytorch-faster-rcnn/tools/../lib/nets/resnet_v1.py", line 177, in load_pretrained_cnn
        self.resnet.load_state_dict({k: state_dict[k] for k in list(self.resnet.state_dict())})
      File "/home/hanbing/ExtraDisk/GithubProject/pytorch-faster-rcnn/tools/../lib/nets/resnet_v1.py", line 177, in
        self.resnet.load_state_dict({k: state_dict[k] for k in list(self.resnet.state_dict())})
    KeyError: 'bn1.num_batches_tracked'
    Command exited with non-zero status 1
    3.40user 0.37system 0:03.78elapsed 99%CPU (0avgtext+0avgdata 577576maxresident)k
    0inputs+16outputs (0major+146467minor)pagefaults 0swaps

    opened by BrightXiaoHan 8
  • undefined symbol: state

    When I try to run the demo, I get the following error:

    [email protected]:~/workspace/jiyy/pytorch-faster-rcnn$ CUDA_VISIBLE_DEVICES=0 ./tools/demo.py
    Traceback (most recent call last):
      File "./tools/demo.py", line 22, in
        from model.test import im_detect
      File "/home/pattern/workspace/jiyy/pytorch-faster-rcnn/tools/../lib/model/test.py", line 20, in
        from model.nms_wrapper import nms
      File "/home/pattern/workspace/jiyy/pytorch-faster-rcnn/tools/../lib/model/nms_wrapper.py", line 11, in
        from nms.pth_nms import pth_nms
      File "/home/pattern/workspace/jiyy/pytorch-faster-rcnn/tools/../lib/nms/pth_nms.py", line 2, in
        from ._ext import nms
      File "/home/pattern/workspace/jiyy/pytorch-faster-rcnn/tools/../lib/nms/_ext/nms/init.py", line 3, in
        from ._nms import lib as _lib, ffi as _ffi
    ImportError: /home/pattern/workspace/jiyy/pytorch-faster-rcnn/tools/../lib/nms/_ext/nms/_nms.so: undefined symbol: state

    opened by aliceyayunji 8
  • torch.utils.ffi is deprecated

    Running ./make.sh calls python build.py in line 14, which has from torch.utils.ffi import create_extension in line 3. This causes the following error:

    ImportError: torch.utils.ffi is deprecated. Please use cpp extensions instead.

    Could you update your code to use cpp extensions instead?

    opened by StrangeTcy 7
  • PyTorch 1.0 Support

    Updated the code base to support Pytorch 1.0 based on maskrcnn-benchmark and faster-rcnn.pytorch.

    P.S.: Needless to say, there are breaking changes from Pytorch 0.4.0, so you may want to maintain a separate branch for that.

    opened by adityaarun1 7
  • Anchors problem with resnet152

    When I use the pretrained model with Resnet152, this error occurs: [image]. I guess there should be 12 anchors in the rpn, but when I checked the cfg I found that there are 9 anchors: [image]. If I need to modify it, what parameters should be set?

    opened by moment-ggw 1
  • Custom VOC dataset

    When I use a custom VOC format dataset, the following error is reported because it has only 14 categories, which does not match the pre-trained weights. How can I fix it?

    size mismatch for cls_score_net.weight: copying a param with shape torch.Size([1001, 2048]) from checkpoint, the shape in current model is torch.Size([15, 2048]).
    size mismatch for cls_score_net.bias: copying a param with shape torch.Size([1001]) from checkpoint, the shape in current model is torch.Size([15]).
    size mismatch for bbox_pred_net.weight: copying a param with shape torch.Size([4004, 2048]) from checkpoint, the shape in current model is torch.Size([60, 2048]).
    size mismatch for bbox_pred_net.bias: copying a param with shape torch.Size([4004]) from checkpoint, the shape in current model is torch.Size([60]).

    opened by sure7018 0
  • _mask.so: undefined symbol: _Py_ZeroStruct

    When I run the following command:

    ./experiments/scripts/test_faster_rcnn.sh $GPU_ID pascal_voc vgg16

    the following error occurs:

    Logging output to experiments/logs/test_vgg16_voc_2007_trainval_.txt.2021-06-02_22-43-16
    + [[ ! -z '' ]]
    + CUDA_VISIBLE_DEVICES=0
    + time python ./tools/test_net.py --imdb voc_2007_test --model output/vgg16/voc_2007_trainval/default/vgg16_faster_rcnn_iter_70000.pth --cfg experiments/cfgs/vgg16.yml --net vgg16 --set ANCHOR_SCALES '[8,16,32]' ANCHOR_RATIOS '[0.5,1,2]'
    Traceback (most recent call last):
      File "./tools/test_net.py", line 13, in
        from datasets.factory import get_imdb
      File "/home/xhay/pytorch-faster-rcnn/tools/../lib/datasets/factory.py", line 14, in
        from datasets.coco import coco
      File "/home/xhay/pytorch-faster-rcnn/tools/../lib/datasets/coco.py", line 23, in
        from pycocotools.coco import COCO
      File "/home/xhay/pytorch-faster-rcnn/tools/../data/coco/PythonAPI/pycocotools/coco.py", line 55, in
        from . import mask as maskUtils
      File "/home/xhay/pytorch-faster-rcnn/tools/../data/coco/PythonAPI/pycocotools/mask.py", line 3, in
        import pycocotools._mask as _mask
    ImportError: /home/xhay/pytorch-faster-rcnn/tools/../data/coco/PythonAPI/pycocotools/_mask.so: undefined symbol: _Py_ZeroStruct
    Command exited with non-zero status 1
    1.61user 0.40system 0:08.87elapsed 22%CPU (0avgtext+0avgdata 267140maxresident)k
    402208inputs+0outputs (1250major+34960minor)pagefaults 0swaps

    Who can help me solve this problem? Thank you!

    opened by ypxingxing 7
  • train on pytorch1.5

    It is all right when I only switch the torch version to 1.1. But I find the loss is nan after 20 iters when I try to train the model on torch1.5. I confirm that only the PyTorch version is different. Do you know what this is about?

    opened by xingaoli 0
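
The fallback proposed in the proposal_target_layer.py issue above (when there are neither foreground nor background RoIs, pick RoIs at random and treat them as negatives) can be sketched in plain Python. This is not the repository's implementation: sample_rois and _draw are hypothetical names, and the default thresholds, batch size and foreground fraction are assumptions standing in for the TRAIN.* config values.

```python
import random


def _draw(pool, k, rng):
    """Draw k indices, sampling with replacement only if the pool is too small."""
    if len(pool) >= k:
        return rng.sample(pool, k)
    return rng.choices(pool, k=k)


def sample_rois(max_overlaps, rois_per_image=256, fg_fraction=0.25,
                fg_thresh=0.5, bg_thresh_hi=0.5, bg_thresh_lo=0.1, rng=random):
    """Return (fg_indices, bg_indices) with a fixed total number of RoIs."""
    if not max_overlaps:
        raise ValueError("no RoIs to sample from")
    fg = [i for i, o in enumerate(max_overlaps) if o >= fg_thresh]
    bg = [i for i, o in enumerate(max_overlaps)
          if bg_thresh_lo <= o < bg_thresh_hi]
    if not fg and not bg:
        # Fallback from the issue: treat every RoI as a background candidate.
        bg = list(range(len(max_overlaps)))
    if fg and bg:
        n_fg = min(int(rois_per_image * fg_fraction), len(fg))
        fg_keep = rng.sample(fg, n_fg)
        bg_keep = _draw(bg, rois_per_image - n_fg, rng)
    elif fg:
        fg_keep, bg_keep = _draw(fg, rois_per_image, rng), []
    else:
        fg_keep, bg_keep = [], _draw(bg, rois_per_image, rng)
    return fg_keep, bg_keep
```

With this fallback the function always returns rois_per_image indices as long as at least one RoI exists, so no branch is left without a result.
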
Releases(2.0)
  • 2.0(Apr 25, 2018)

    The code has been updated to pytorch 0.4.

    Thanks to .to(device), the model can now be simply switched between cpu and cuda. By default, if there is no cuda device when you run test or demo, the model will automatically run on cpu.

  • 1.0(Oct 20, 2017)

Owner
Ruotian(RT) Luo
Phd student at TTIC