pytorch-mask-rcnn
A PyTorch implementation of the Mask R-CNN detection framework based on
This project supports single-GPU training of a ResNet101-based Mask R-CNN (without FPN support). Its purpose is to support the experiments in MAttNet, whose REFER dataset is a subset of the COCO training portion. Our pre-trained model is therefore trained on the COCO_2014_train_minus_refer_valtest + COCO_2014_valminusminival images.
Prerequisites
Python 2.7
PyTorch 0.2 or higher
CUDA 8.0 or higher
requirements.txt
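The version requirements above can be checked before going further. The following is a minimal sketch, not part of this repository: the helper names `parse_version` and `meets_minimum` are hypothetical, and the script only reports what it finds.

```python
from __future__ import print_function

import re


def parse_version(text):
    """Turn a version string such as '0.4.1' or '1.13.1+cu117' into a tuple of ints."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(found, required):
    """Return True if version string `found` is at least `required`."""
    return parse_version(found) >= parse_version(required)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed")
    else:
        # 0.2 is the minimum PyTorch version listed in the prerequisites.
        print("PyTorch", torch.__version__, "ok:", meets_minimum(torch.__version__, "0.2"))
        print("CUDA available:", torch.cuda.is_available())
```

The comparison is done on integer tuples rather than raw strings, so that a version like 0.10 is not ordered before 0.2.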
Preparation
First of all, clone the code together with the refer API:
git clone --recursive https://github.com/lichengunc/mask-faster-rcnn
Prepare data:
git clone https://github.com/cocodataset/cocoapi data/coco
git clone https://github.com/lichengunc/refer data/refer
ImageNet Weights: Find the resnet101-caffe download link from this repository, and save the download as data/imagenet_weights/res101.pth.
coco_minus_refer: Make the coco_minus_refer annotation, which is saved as data/coco/annotations/instances_train_minus_refer_valtest2014.json:
python tools/make_coco_minus_refer_instances.py
Compilation
As pointed out by ruotianluo/pytorch-faster-rcnn, choose the right -arch to compile the CUDA code:
    GPU model                      Architecture
    TitanX (Maxwell/Pascal)        sm_52
    GTX 960M                       sm_50
    GTX 1080 (Ti)                  sm_61
    Grid K520 (AWS g2.2xlarge)     sm_30
    Tesla K80 (AWS p2.xlarge)      sm_37
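For a GPU that is not in the table, the architecture string can usually be derived from the device's compute capability. The sketch below is an illustration only: `ARCH_BY_GPU` and `arch_from_capability` are hypothetical names, and `torch.cuda.get_device_capability` is assumed to be available, which holds for recent PyTorch releases but may not for 0.2. The value it reports is the device's native capability and may differ from the table entry for some cards.

```python
from __future__ import print_function

# The table from this README, keyed by GPU model name.
ARCH_BY_GPU = {
    "TitanX (Maxwell/Pascal)": "sm_52",
    "GTX 960M": "sm_50",
    "GTX 1080 (Ti)": "sm_61",
    "Grid K520 (AWS g2.2xlarge)": "sm_30",
    "Tesla K80 (AWS p2.xlarge)": "sm_37",
}


def arch_from_capability(major, minor):
    """Format a compute capability such as (6, 1) as an -arch value, 'sm_61'."""
    return "sm_{}{}".format(major, minor)


if __name__ == "__main__":
    try:
        import torch
        # Assumption: this call exists in the installed PyTorch version.
        major, minor = torch.cuda.get_device_capability(0)
    except Exception as err:
        print("could not query the GPU:", err)
    else:
        print("-arch", arch_from_capability(major, minor))
```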
Compile the CUDA-based nms and roi_pooling using the following commands:
cd lib
make
Training
Run the following (with notime as the extra tag):
./experiments/scripts/train_mask_rcnn_notime.sh 0 refcoco res101 notime
Monitor the training process with tensorboard, and view it at server.cs.unc.edu:port_number
tensorboard --logdir tensorboard/res101 --port=port_number
Evaluation
Run the following (with notime as the extra tag):
./experiments/scripts/test_mask_rcnn_notime.sh 0 refcoco res101 notime
Detection Comparison:
    Detection          AP      AP50    AP75
    Faster R-CNN       34.1    53.7    36.8
    Our Mask R-CNN     35.8    55.3    38.6
Segmentation Comparison:
Our setup differs from the original Mask R-CNN in the following ways:
We have fewer (~6,500) training images.
Our training is single GPU.
The shorter image side in our model is 600px instead of 800px.
    Segmentation           AP      AP50    AP75
    Original Mask R-CNN    32.7    54.2    34.0
    Our Mask R-CNN         30.7    52.3    32.4
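To make the 600px versus 800px difference concrete, the sketch below computes the size an image would have after its shorter side is scaled to a target length. It is a simplified illustration under assumptions: `resized_shape` is a hypothetical helper, not this repository's preprocessing code, and it applies no cap on the longer side, which the actual pipeline may do.

```python
def resized_shape(height, width, shorter_side=600):
    """Scale (height, width) so the shorter side equals `shorter_side`.

    Returns the new height, the new width and the scale factor.
    """
    scale = float(shorter_side) / min(height, width)
    new_height = int(round(height * scale))
    new_width = int(round(width * scale))
    return new_height, new_width, scale


if __name__ == "__main__":
    # A 480x640 image at the 600px setting used here and at the 800px setting.
    print(resized_shape(480, 640, shorter_side=600))
    print(resized_shape(480, 640, shorter_side=800))
```

For a 480x640 input, the 600px setting gives a 600x800 image, while the 800px setting gives 800x1067, so each image carries noticeably fewer pixels under the smaller setting.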
Pretrained Model
We provide the model we used in MAttNet for mask comprehension.
Download and put the downloaded .pth and .pkl files into output/res101/coco_2014_train_minus_refer_valtest+coco_2014_valminusminival/notime
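After copying the files, a quick check that they landed in the expected directory can save a failed run later. This is a minimal sketch: `find_model_files` is a hypothetical helper, and the file names inside the directory are not assumed, only the .pth and .pkl extensions mentioned above.

```python
from __future__ import print_function

import glob
import os

# Directory named in this README for the pretrained model files.
MODEL_DIR = os.path.join(
    "output",
    "res101",
    "coco_2014_train_minus_refer_valtest+coco_2014_valminusminival",
    "notime",
)


def find_model_files(model_dir=MODEL_DIR):
    """Return a dict mapping '.pth' and '.pkl' to the sorted matching paths."""
    found = {}
    for ext in (".pth", ".pkl"):
        found[ext] = sorted(glob.glob(os.path.join(model_dir, "*" + ext)))
    return found


if __name__ == "__main__":
    for ext, paths in sorted(find_model_files().items()):
        print(ext, "->", paths if paths else "none found")
```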
Demo