Abstract This paper presents a salient object detection method that integrates both top-down and bottom-up saliency inference in an iterative and cooperative manner. The topdown process is used for coarse-to-fifine saliency estimation, where high-level saliency is gradually integrated with fifiner lower-layer features to obtain a fifine-grained result. The bottom-up process infers the high-level, but rough saliency through gradually using upper-layer, semantically-richer features. These two processes are alternatively performed, where the bottom-up process uses the fifine-grained saliency obtained from the top-down process to yield enhanced highlevel saliency estimate, and the top-down process, in turn, is further benefifited from the improved high-level information. The network layers in the bottom-up/top-down processes are equipped with recurrent mechanisms for layerwise, step-by-step optimization. Thus, saliency information is effectively encouraged to flflow in a bottom-up, top-down and intra-layer manner. We show that most other saliency models based on fully convolutional networks (FCNs) are essentially variants of our model. Extensive experiments on several famous benchmarks clearly demonstrate the superior performance, good generalization, and powerful learning ability of our proposed saliency inference framework.