Abstract. Interactive image segmentation is critical for many image
editing tasks. While recent advances in interactive segmentation focus on the region-based paradigm, more traditional boundary-based methods such as Intelligent Scissors remain popular in practice
as they give users active control of the object boundaries. Existing methods for boundary-based segmentation rely solely on low-level
image features, such as edges, for boundary extraction, which limits their
ability to adapt to high-level image content and user intention. In this
paper, we introduce an interaction-aware method for boundary-based
image segmentation. Instead of relying on pre-defined low-level image
features, our method adaptively predicts object boundaries according to
image content and user interactions. To this end, we develop a fully convolutional encoder-decoder network that takes both the image and user
interactions (e.g., clicks on boundary points) as input and predicts semantically meaningful boundaries that match user intentions. Our method
explicitly models the dependency of boundary extraction results on image content and user interactions. Experiments on two public interactive
segmentation benchmarks show that our method significantly improves
the boundary quality of segmentation results compared to state-of-the-art methods while requiring fewer user interactions.
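
As an illustration of how an image and boundary clicks might be presented jointly to a fully convolutional network, the following minimal sketch renders the clicks as a Gaussian heatmap and appends it to the RGB image as a fourth channel. The abstract does not specify the click encoding, so the Gaussian form, the `sigma` value, and the names `click_heatmap` and `build_network_input` are assumptions made here for illustration only; plain Python lists are used to keep the sketch dependency-free.

```python
import math


def click_heatmap(height, width, clicks, sigma=10.0):
    """Render (row, col) clicks as a height x width map with values in (0, 1].

    Assumption: each click contributes a Gaussian bump, and overlapping
    bumps are merged with a pixel-wise maximum.
    """
    heat = [[0.0] * width for _ in range(height)]
    for r, c in clicks:
        for y in range(height):
            for x in range(width):
                d2 = (y - r) ** 2 + (x - c) ** 2
                value = math.exp(-d2 / (2.0 * sigma ** 2))
                heat[y][x] = max(heat[y][x], value)
    return heat


def build_network_input(image, clicks, sigma=10.0):
    """Append the click heatmap to an H x W image of [r, g, b] pixels.

    Returns an H x W grid of [r, g, b, heat] pixels, i.e. a 4-channel input.
    """
    height, width = len(image), len(image[0])
    heat = click_heatmap(height, width, clicks, sigma)
    return [
        [list(image[y][x]) + [heat[y][x]] for x in range(width)]
        for y in range(height)
    ]


# Hypothetical example: a blank 32 x 48 image with two boundary clicks.
image = [[[0.0, 0.0, 0.0] for _ in range(48)] for _ in range(32)]
clicks = [(10, 20), (25, 40)]
net_input = build_network_input(image, clicks)
```

Encoding the clicks as an extra input channel is one simple way to let the predicted boundary depend on both image content and user interaction; other encodings (for example, distance maps) would fit the same interface.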