Whale Detector for Kaggle's Right Whale Recognition Challenge
Kaggle's NOAA Right Whale Recognition Challenge
aims to develop an algorithm to identify individual Right Whales,
which are critically endangered. It is a great chance to study machine
learning and digital image processing, although it looks to me like a really
hard challenge. Anyway, I've developed this method to detect the whale in
the photograph and I'm releasing it in the hope that it may help others.
It takes advantage of the fact that most pictures are pretty plain,
with almost all of the area covered by water, and have a smaller region
of interest which corresponds to the whale, so the histogram
of most subimages will be similar to that of the whole image, except in the
region of interest. The algorithm recursively looks for subimages whose HSV
histogram is not similar to the original image's histogram, marking those
regions in white and everything else in black. It then searches for the biggest
contiguous region using contours and places a bounding box around it,
assuming it's the whale. The image cropped to that box is called "extract" and
is saved along with the black & white mask.
Uses Python 2.7 and OpenCV 3.0.
Original Image:
Whale found:
Areas found mask:
ROI Mask:
ROI Extract:
Running with docker, Python
The Jupyter version (hist_zones.ipynb) works well with a Docker image that contains OpenCV 3 and Python 3, as described here.
Just change the last line of the Dockerfile to
"CMD /usr/local/bin/jupyter-notebook --ip=0.0.0.0 --allow-root"
to allow root access.