Abstract
Human eye fixations often correlate with the locations of salient objects in the scene. However, only a handful of approaches have attempted to simultaneously address the related aspects of eye fixations and object saliency. In this work, we propose a deep convolutional neural network (CNN) capable of predicting eye fixations and segmenting salient objects in a unified framework. We design the initial network layers, shared between both tasks, such that they capture the object-level semantics and the global contextual aspects of saliency, while the deeper layers of the network address task-specific aspects. In addition, our network captures saliency at multiple scales via inception-style convolution blocks. Our network shows a significant improvement over the current state of the art for both eye fixation prediction and salient object segmentation across a number of challenging datasets.