
detectron-self-train


PyTorch-Detectron for domain adaptation by self-training on hard examples


This codebase replicates results for pedestrian detection with domain shifts on the BDD100k dataset, following the CVPR 2019 paper Automatic adaptation of object detectors to new domains using self-training. We provide trained models, training and evaluation scripts, and the dataset splits for download. More details are available on the project page.

This repository is heavily based on A Pytorch Implementation of Detectron. We modify it for experiments on domain adaptation of face and pedestrian detectors.

If you find this codebase useful, please consider citing:

@inproceedings{roychowdhury2019selftrain,
    Author = {Aruni RoyChowdhury and Prithvijit Chakrabarty and Ashish Singh and SouYoung Jin and Huaizu Jiang and Liangliang Cao and Erik Learned-Miller},
    Title = {Automatic adaptation of object detectors to new domains using self-training},
    Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
    Year = {2019}
}

Getting Started

Clone the repo:

git clone git@github.com:AruniRC/detectron-self-train.git

Requirements

Tested under python3.

  • python packages

    • pytorch>=0.3.1

    • torchvision>=0.2.0

    • cython

    • matplotlib

    • numpy

    • scipy

    • opencv

    • pyyaml

    • packaging

    • pycocotools — for COCO dataset, also available from pip.

    • tensorboardX — for logging the losses in Tensorboard

  • An NVIDIA GPU and CUDA 8.0 or higher. Some operations only have GPU implementations.

  • NOTE: different versions of the PyTorch package have different memory usage; an example installation command is sketched below this list.
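One possible way to install the Python dependencies above with pip is sketched below. This is only a sketch, assuming you pick a PyTorch/torchvision build matching your CUDA version; the repo does not prescribe exact pinned versions beyond the minimums listed.

# Sketch: install the listed Python dependencies (choose torch/torchvision builds matching your CUDA)
pip install torch torchvision cython matplotlib numpy scipy opencv-python pyyaml packaging
pip install pycocotools tensorboardX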

Installation

This walkthrough describes setting up this Detectron repo. The detailed instructions are in INSTALL.md.

Dataset

Create a data folder under the repo root:

cd {repo_root}
mkdir data

BDD-100k

Our pedestrian detection task uses both labeled and unlabeled data from the Berkeley DeepDrive BDD-100k dataset. Please register and download the dataset from their website. We use a symlink from our project root, data/bdd100k, to the location of the downloaded dataset. The folder structure should look like this:

data/bdd100k/
    images/
        test/
        train/
        val/
    labels/
        train/
        val/

BDD-100k takes about 6.5 GB of disk space. The 100k unlabeled videos take 234 GB, but you do not need to download them, since we have already done the hard example mining on these and the extracted frames (+ pseudo-labels) are available for download.
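For example, the symlink can be created as follows; /path/to/bdd100k below is a placeholder for wherever the dataset was downloaded:

cd {repo_root}
ln -s /path/to/bdd100k data/bdd100k   # placeholder path to the downloaded BDD-100k folder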

BDD Hard Examples

Mining the hard positives ("HPs") involves detecting pedestrians and forming tracklets on the 100K videos. This was done on the UMass GPU cluster and took about a week. We do not include this pipeline here (yet) -- the mined video frames and annotations are available for download as a gzipped tarball from here. NOTE: this is a large download (23 GB). The data retains the permissions and licensing associated with the BDD-100k dataset (we make the video frames available here for ease of research).

Now we create a symlink to the untarred BDD HPs from the project data folder, which should have the following structure: data/bdd_peds_HP18k/*.jpg. The image naming convention is <video-name>_<frame-number>.jpg.
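A possible sequence is sketched below; the tarball filename, extraction path, and untarred folder name are placeholders, so substitute the actual names from your download:

tar -xzf <bdd_HP_tarball>.tar.gz -C /path/to/extracted          # placeholder tarball name and path
cd {repo_root}
ln -s /path/to/extracted/<untarred_HP_folder> data/bdd_peds_HP18k   # so that data/bdd_peds_HP18k/*.jpg exists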

Annotation JSONs

All the annotations are assumed to be downloaded inside a folder data/bdd_jsons relative to the project root: data/bdd_jsons/*.json. We use symlinks here as well, in case the JSONs are kept in some other location.

| Data Split | JSON | Dataset name | Image Dir. |
|---|---|---|---|
| BDD_Source_Train | bdd_peds_train.json | bdd_peds_train | data/bdd100k |
| BDD_Source_Val | bdd_peds_val.json | bdd_peds_val | data/bdd100k |
| BDD_Target_Train | bdd_peds_not_clear_any_daytime_train.json | bdd_peds_not_clear_any_daytime_train | data/bdd100k |
| BDD_Target_Val | bdd_peds_not_clear_any_daytime_val.json | bdd_peds_not_clear_any_daytime_val | data/bdd100k |
| BDD_dets | bdd_dets18k.json | DETS18k | data/bdd_peds_HP18k |
| BDD_HP | bdd_HP18k.json | HP18k | data/bdd_peds_HP18k |
| BDD_score_remap | bdd_HP18k_remap_hist.json | HP18k_remap_hist | data/bdd_peds_HP18k |
| BDD_target_GT | bdd_target_labeled.json | bdd_peds_not_clear_any_daytime_train_100 | data/bdd100k |
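As a quick sanity check after downloading, a split file can be inspected with a one-liner like the one below. This assumes the JSONs follow the COCO-style annotation layout (top-level images/annotations keys) used by Detectron-based codebases, and uses bdd_peds_train.json from the table above:

# Count images and annotations in one downloaded split (assumes COCO-style keys)
python -c "import json; d = json.load(open('data/bdd_jsons/bdd_peds_train.json')); print(len(d['images']), 'images,', len(d['annotations']), 'annotations')"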

Models

Use the environment variable CUDA_VISIBLE_DEVICES to control which GPUs to use. All the training scripts are run with 4 GPUs. The trained model checkpoints can be downloaded from the links under the column Model weights. The eval scripts need to be modified to point to where the corresponding model checkpoints have been downloaded locally. To be consistent, we suggest creating a folder under the project root like data/bdd_pre_trained_models and saving all the models under it.

The performance numbers shown are from single models (the same models available for download), while the tables in the paper show results averaged across 5 rounds of train/test.

| Method | Model weights | Config YAML | Train script | Eval script | AP, AR |
|---|---|---|---|---|---|
| Baseline | bdd_baseline | cfg | train | eval | 15.21, 33.09 |
| Dets | bdd_dets | cfg | train | eval | 27.55, 56.90 |
| HP | bdd_hp | cfg | train | eval | 28.34, 58.04 |
| HP-constrained | bdd_hp-cons | cfg | train | eval | 29.57, 56.48 |
| HP-score-remap | bdd_score-remap | cfg | train | eval | 28.11, 56.80 |
| DA-im | bdd_da-im | cfg | train | eval | 25.71, 56.29 |
| Src-Target-GT | bdd_target-gt | cfg | train | eval | 35.40, 66.26 |
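For example, a 4-GPU training run can be launched roughly as follows, assuming the linked train scripts are shell scripts like the demo scripts; the script path below is a placeholder for whichever method you want to reproduce:

export CUDA_VISIBLE_DEVICES=0,1,2,3   # restrict training to 4 GPUs
bash <path/to/train_script>.sh        # placeholder for the train script linked in the table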

Inference demo

[Figure: sample detections from the Baseline and the domain-adapted HP-constrained models on a demo image]

The folder gypsum/scripts/demo contains two shell scripts that run the pre-trained Baseline (BDD-Source trained) and HP-constrained (domain adapted to BDD Target) models on a sample image. Please change the MODEL_PATH variable in these scripts to where the appropriate models have been downloaded locally. Your results should resemble the example shown above. Note that the domain adapted model (HP-constrained) detects pedestrians with higher confidence (visualization threshold is 0.9 on the confidence score), while making one false positive in the background.
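Concretely, running one of the demos might look like the sketch below; the script and checkpoint filenames are placeholders, so substitute the actual names under gypsum/scripts/demo and your local download location:

# Edit MODEL_PATH inside the demo script to point at the downloaded checkpoint, e.g.
#   MODEL_PATH=data/bdd_pre_trained_models/<baseline_checkpoint>.pth    # placeholder
bash gypsum/scripts/demo/<baseline_demo_script>.sh                      # placeholder script name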

Acknowledgement

This material is based on research sponsored by the AFRL and DARPA under agreement number FA8750-18-2-0126. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the AFRL and DARPA or the U.S. Government. We acknowledge support from the MassTech Collaborative grant for funding the UMass GPU cluster. We thank Tsung-Yu Lin and Subhransu Maji for helpful discussions.

We appreciate the well-organized and accurate codebase from the creators of A Pytorch Implementation of Detectron. Thanks also to the creators of BDD-100k, whose licensing has allowed us to share our pseudo-labeled video frames for the academic, non-commercial purpose of quickly reproducing results.

