NucleiDetectron
7th place solution for the 2018 Data Science Bowl with a score of 0.591 with a single model submission.
Nuclei Detectron is built on Detectorn which is a Mask R-CNN based solution. This approach was chosen since Mask R-CNN works well on instance segmentation problems.
P.S. Unofficial score of 0.608 without ensemble on submissions that were not in the 2 final submissions.
The challenge is to build a model that can identify a range of nuclei across varied conditions. The hard part of the challenge is to generalize across nuclei types that are very different from training set, and in different lighting conditions.
There were several nuclei datasets with outlines as annotations.
Mark an outer contour to encompass contours that are at image edges.
then do cv2.findContours to get the polygons of mask.
Applied classical computer vision techniques to convert ground truth from outlines to masks.
This involved adding boundary pixels to the image so all contours are closed.
Given outlines of cells with overlaps/touching or at border,
Standardized all datasets into COCO mask RLE JSON file format.
You can use cocoapi to load the annotations.
Cut image into tiles when images are bigger than 1000 pixels
This was necessary since large image features did not fit in GPU memory.
Cluster images into classes based on the color statistics.
Normalize classes size
Oversample/undersample images from clusters to a constant number of images per class in each epoch.
Fill holes in masks
Split nuclei masks that are fused
Applied morphological Erosion and Dilation to seperate fused cells
Use statistics of nuclie sizes in an image to find outliers
Zero mean unit variance normalization
Grey scale: Color-to-Grayscale: Does the Method Matter in Image Recognition.
Very important how you convert to grey scale. Many algorithms for the conversion, loss of potential data.
Luminous
Intensity
Value: This is the method I used.
Data augmentation is one of the key to achieve good generalization in this challenge.
Invert
This augmentation helped in reducing generalization error significantly
Randomly choosing to invert caused the models to generalize across all kids of backgrounds in the local validation set.
Geometric
This is very useful to make the circular looking cells to look stretched
PerspectiveTransform
PiecewiseAffine
Flip
Rotate (0, 90, 180, 270)
Crop
Alpha blending
Create geometrical blur by affine operation
Shear, rotate, translate, scale
Pixel
AddToHueAndSaturation
Multiply
Dropout, CoarseDropout
ContrastNormalization
Noise
AdditiveGaussianNoise
SimplexNoiseAlpha
FrequencyNoiseAlpha
Blur
GaussianBlur
AverageBlur
MedianBlur
BilateralBlur
Texture
Superpixels
Sharpen
Emboss
EdgeDetect
DirectedEdgeDetect
ElasticTransformation
Invert: Have improved the performance a lot
Multiple Scales 900, 1000, 1100
Flip left right
Detectron network configuration changes from the baseline e2e_mask_rcnn_X-152-32x8d-FPN-IN5k_1.44x.yaml are:
Create small anchor sizes for small nuclei. RPN_ANCHOR_START_SIZE: 8 # default 32
Add more aspect rations for nuclei that are close but in cylindrical structure. RPN_ASPECT_RATIOS: (0.2, 0.5, 1, 2, 5)
Increase the ROI resolution. ROI_XFORM_RESOLUTION: 14
Increase the number of detections per image from default 100. DETECTIONS_PER_IM: 500
Decreased warmup fraction to 0.01
Increased warmup iterations to 10,000
Gave mask loss more weight WEIGHT_LOSS_MASK: 1.2
Threshold on area to remove masks below area of 15 pixels
Threshold on BBox confidence of 0.9
Mask NMS
On decreasing order of confidence, simple union-mask strategy to remove overlapping segments or cut segments at overlaps if overlap is below 30% of the mask.
Inversion in augmentation
Blurring and frequency noise
Additional datasets, even though they caused a drop on the public leaderboard, I noticed no drop in local validation set.
Mask dilations and erosions
This did not have any improvement in the segmentation in my experiments
Use contour approximations in place of original masks
This did not have any improvement either. Maybe this could add a boost if using light augmentations.
Randomly apply structuring like open-close
Soft NMS thresh
Did not improve accuracy
Color images
Did not perform as well as grey images after augmentations
Color style transfer. Take a source image and apply the color style to target image.
Style transfer: Was losing a lot of details on some nuclei but looked good on very few images.
Dilation of masks in post processing, this drastically increased error because the model masks are already good.
Distance transform and split masks during training.
Ensemble multiple Mask R-CNN's
Two stage predictions with U-Net after box proposals.
Augmentation smoothing during training
Increase the noise and augmentation slowly during the training phase, like from 10% to 50%
Reduce the augmentation from 90% to 20% during training, for generalization and fitting.
Experiment with different levels of augmentation individually across, noise, blur, texture, alpha blending.
Different layer normalization techniques, with batch size more than one image at a time. Need bigger GPU.
Little bit of hyperparameter search on thresholds and network architecture.
U-Net with watershed, did not think this approach would outperform Mask R-CNN
For basic host setup of Nvidia driver and Nvidia-Docker go to setup.sh
. Please find installation instructions for Caffe2 and Detectron in INSTALL.md
.
chmod +x bin/nuclei/train.sh && ./bin/nuclei/train.sh -e 1_aug_gray_0_5_0 -v 1_aug_gray_0_5_0 -g 1 &chmod +x bin/nuclei/test.sh && ./bin/nuclei/test.sh -e 1_aug_gray_1_5_1_stage_2_v1 -v 1_aug_gray_1_5_1_stage_2_v1 -g 1 &tail -f /detectron/lib/datasets/data/logs/test_log python lib/datasets/nuclei/write_submission.py --results-root /detectron/lib/datasets/data/results/ --run-version '1_aug_gray_1_5_1_stage_2_v1' --iters '65999' --area-thresh 15 --acc-thresh 0.9 --intersection-thresh 0.3
Detectron. Ross Girshick and Ilija Radosavovic. Georgia Gkioxari. Piotr Doll'{a}r. Kaiming He. Github, Jan. 2018.
Image augmentation for machine learning experiments. Alexander Jung. Github, Jan. 2015.
Normalizing brightfield, stained and fluorescence. Kevin Mader. Kaggle Notebook, Apr. 2018.
Fast, tested RLE and input routines. Sam Stainsby. Kaggle Notebook, Apr. 2018.
Example Metric Implementation. William Cukierski. Kaggle Notebook, Apr. 2018.
A collection of datasets converted into COCO segmentation format.
Resized few images
Tiled some images with lot of annotations to fit in memory
Extracted masks when only outlines were available
This is done by finding contours
DATASETS = { 'nuclei_stage1_train': { IM_DIR: _DATA_DIR + '/Nuclei/stage_1_train', ANN_FN: _DATA_DIR + '/Nuclei/annotations/stage1_train.json' }, 'nuclei_stage_1_local_train_split': { IM_DIR: _DATA_DIR + '/Nuclei/stage_1_train', ANN_FN: _DATA_DIR + '/Nuclei/annotations/stage_1_local_train_split.json' }, 'nuclei_stage_1_local_val_split': { IM_DIR: _DATA_DIR + '/Nuclei/stage_1_train', ANN_FN: _DATA_DIR + '/Nuclei/annotations/stage_1_local_val_split.json' }, 'nuclei_stage_1_test': { IM_DIR: _DATA_DIR + '/Nuclei/stage_1_test', ANN_FN: _DATA_DIR + '/Nuclei/annotations/stage_1_test.json' }, 'nuclei_stage_2_test': { IM_DIR: _DATA_DIR + '/Nuclei/stage_2_test', ANN_FN: _DATA_DIR + '/Nuclei/annotations/stage_2_test.json' }, 'cluster_nuclei': { IM_DIR: _DATA_DIR + '/Nuclei/cluster_nuclei', ANN_FN: _DATA_DIR + '/Nuclei/annotations/cluster_nuclei.json' }, 'BBBC007': { IM_DIR: _DATA_DIR + '/Nuclei/BBBC007', ANN_FN: _DATA_DIR + '/Nuclei/annotations/BBBC007.json' }, 'BBBC006': { IM_DIR: _DATA_DIR + '/Nuclei/BBBC006', ANN_FN: _DATA_DIR + '/Nuclei/annotations/BBBC006.json' }, 'BBBC018': { IM_DIR: _DATA_DIR + '/Nuclei/BBBC018', ANN_FN: _DATA_DIR + '/Nuclei/annotations/BBBC018.json' }, 'BBBC020': { IM_DIR: _DATA_DIR + '/Nuclei/BBBC020', ANN_FN: _DATA_DIR + '/Nuclei/annotations/BBBC020.json' }, 'nucleisegmentationbenchmark': { IM_DIR: _DATA_DIR + '/Nuclei/nucleisegmentationbenchmark', ANN_FN: _DATA_DIR + '/Nuclei/annotations/nucleisegmentationbenchmark.json' }, '2009_ISBI_2DNuclei': { IM_DIR: _DATA_DIR + '/Nuclei/2009_ISBI_2DNuclei', ANN_FN: _DATA_DIR + '/Nuclei/annotations/2009_ISBI_2DNuclei.json' }, 'nuclei_partial_annotations': { IM_DIR: _DATA_DIR + '/Nuclei/nuclei_partial_annotations', ANN_FN: _DATA_DIR + '/Nuclei/annotations/nuclei_partial_annotations.json' }, 'TNBC_NucleiSegmentation': { IM_DIR: _DATA_DIR + '/Nuclei/TNBC_NucleiSegmentation', ANN_FN: _DATA_DIR + '/Nuclei/annotations/TNBC_NucleiSegmentation.json' }, }
import jsonfrom pathlib import Pathimport numpy as npfrom PIL import Imagefrom pycocotools import mask as mask_utilROOT_DIR = Path('/media/gangadhar/DataSSD1TB/ROOT_DATA_DIR/')DATASET_WORKING_DIR = ROOT_DIR / 'Nuclei'annotations_file = DATASET_WORKING_DIR / 'annotations/stage1_train.json'COCO = json.load(open(annotations_file.as_posix())) image_metadata = COCO['images'][0]print image_metadata# {u'file_name': u'4ca5081854df7bbcaa4934fcf34318f82733a0f8c05b942c2265eea75419d62f.jpg',# u'height': 256,# u'id': 0,# u'nuclei_class': u'purple_purple_320_256_sparce',# u'width': 320}def get_masks(im_metadata): image_annotations = [] for annotation in COCO['annotations']: if annotation['image_id'] == im_metadata['id']: image_annotations.append(annotation) segments = [annotation['segmentation'] for annotation in image_annotations] masks = mask_util.decode(segments) return masks masks = get_masks(image_metadata)print masks.shape# (256, 320, 37)def show(i): i = np.asarray(i, np.float) m,M = i.min(), i.max() I = np.asarray((i - m) / (M - m + 0.000001) * 255, np.uint8) Image.fromarray(I).show() show(np.sum(masks, -1))# this should show an image with all masks
2018 Data Science Bowl: Find the nuclei in divergent images to advance medical discovery. Competition, Kaggle, Apr. 2018. Download: https://www.kaggle.com/c/data-science-bowl-2018/data
2018 Data Science Bowl: Kaggle Data Science Bowl 2018 dataset fixes. Konstantin Lopuhin, Apr. 2018. Download: https://github.com/lopuhin/kaggle-dsbowl-2018-dataset-fixes
TNBC_NucleiSegmentation: A dataset for nuclei segmentation based on Breast Cancer patients. Naylor Peter Jack; Walter Thomas; Laé Marick; Reyal Fabien. 2018. Download: https://zenodo.org/record/1175282/files/TNBC_NucleiSegmentation.zip
A Dataset and a Technique for Generalized Nuclear Segmentation for Computational Pathology. Kumar N, Verma R, Sharma S, Bhargava S, Vahadane A, Sethi A. 2017. Download: https://nucleisegmentationbenchmark.weebly.com/dataset.html
Deep learning for digital pathology image analysis: A comprehensive tutorial with selected use cases. Andrew Janowczyk, Anant Madabhushi. 2016. Download: http://andrewjanowczyk.com/wp-static/nuclei.tgz
Nuclei Dataset: Include 52 images of 200x200 pixels. Jie Shu, Guoping Qiu, Mohammad Ilyas. Immunohistochemistry (IHC) Image Analysis Toolbox, Jan. 2015. Download: https://www.dropbox.com/s/9knzkp9g9xt6ipb/cluster%20nuclei.zip?dl=0
BBBC006v1: image set available from the Broad Bioimage Benchmark Collection. Vebjorn Ljosa, Katherine L Sokolnicki & Anne E Carpenter. 2012. Download: https://data.broadinstitute.org/bbbc/BBBC007/
BBBC007v1: image set available from the Broad Bioimage Benchmark Collection. Vebjorn Ljosa, Katherine L Sokolnicki & Anne E Carpenter. 2012. Download: https://data.broadinstitute.org/bbbc/BBBC006/
BBBC018v1: image set available from the Broad Bioimage Benchmark Collection. Vebjorn Ljosa, Katherine L Sokolnicki & Anne E Carpenter. 2012. Download: https://data.broadinstitute.org/bbbc/BBBC018/
BBBC020v1: image set available from the Broad Bioimage Benchmark Collection. Vebjorn Ljosa, Katherine L Sokolnicki & Anne E Carpenter. 2012. Download: https://data.broadinstitute.org/bbbc/BBBC020/
Nuclei Segmentation In Microscope Cell Images: A Hand-Segmented Dataset And Comparison Of Algorithms. L. P. Coelho, A. Shariff, and R. F. Murphy. Proceedings of the 2009 IEEE International Symposium on Biomedical Imaging (ISBI 2009), pp. 518-521, 2009. Download: http://murphylab.web.cmu.edu/data/2009_ISBI_2DNuclei_code_data.tgz
还没有评论,说两句吧!
热门资源
Keras-ResNeXt
Keras ResNeXt Implementation of ResNeXt models...
seetafaceJNI
项目介绍 基于中科院seetaface2进行封装的JAVA...
spark-corenlp
This package wraps Stanford CoreNLP annotators ...
capsnet-with-caps...
CapsNet with capsule-wise convolution Project ...
inferno-boilerplate
This is a very basic boilerplate example for pe...
智能在线
400-630-6780
聆听.建议反馈
E-mail: support@tusaishared.com