Abstract. Accurate semantic image segmentation requires the joint consideration of local appearance, semantic information, and global scene context. In today’s age of pre-trained deep networks and their powerful convolutional features, state-of-the-art semantic segmentation approaches differ mostly in how they choose to combine these different kinds of information. In this work, we propose a novel scheme for aggregating features from different scales, which we refer to as Multi-Scale Context Intertwining (MSCI). In contrast to previous approaches, which typically propagate information between scales in a one-directional manner, we merge pairs of feature maps in a bidirectional and recurrent fashion, via connections between two LSTM chains. By training the parameters of the LSTM units on the segmentation task, the above approach learns how to extract powerful and effective features for pixel-level semantic segmentation, which are then combined hierarchically. Furthermore, rather than using fixed information propagation routes, we subdivide images into super-pixels, and use the spatial relationships between them to perform image-adapted context aggregation. Our extensive evaluation on public benchmarks indicates that all of the aforementioned components of our approach increase the effectiveness of information propagation throughout the network and significantly improve its eventual segmentation accuracy.
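To make the intertwining mechanism concrete, the sketch below shows one way two LSTM chains, one per scale, could exchange hidden states bidirectionally at every recurrent step before their outputs are merged. This is a minimal illustration in PyTorch, not the authors' implementation: the `ConvLSTMCell` and `IntertwiningBlock` names, the channel counts, the number of exchange steps, and the bilinear resizing are assumptions made here for brevity, and the paper's super-pixel-based, image-adapted propagation routes are omitted.

```python
# Minimal sketch (assumed PyTorch formulation, not the authors' code) of
# bidirectional, recurrent intertwining between two feature scales.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvLSTMCell(nn.Module):
    """A standard convolutional LSTM cell; gates come from input + hidden state."""

    def __init__(self, in_ch, hid_ch):
        super().__init__()
        # One conv produces all four gates (input, forget, output, candidate).
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, 3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.gates(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


class IntertwiningBlock(nn.Module):
    """Two LSTM chains, one per scale, that exchange hidden states every step."""

    def __init__(self, ch, steps=2):
        super().__init__()
        self.steps = steps
        # Each chain sees its own features plus the other chain's hidden state.
        self.fine = ConvLSTMCell(2 * ch, ch)
        self.coarse = ConvLSTMCell(2 * ch, ch)

    def forward(self, f_fine, f_coarse):
        # Both feature maps are assumed to have `ch` channels.
        hf = cf = torch.zeros_like(f_fine)
        hc = cc = torch.zeros_like(f_coarse)
        for _ in range(self.steps):
            # Coarse-to-fine: upsample the coarse hidden state and feed it,
            # together with the fine features, into the fine chain.
            hc_up = F.interpolate(hc, size=f_fine.shape[-2:],
                                  mode='bilinear', align_corners=False)
            hf, cf = self.fine(torch.cat([f_fine, hc_up], dim=1), (hf, cf))
            # Fine-to-coarse: downsample the freshly updated fine hidden state
            # and feed it, together with the coarse features, into the coarse
            # chain -- exchange runs in both directions at every step.
            hf_dn = F.interpolate(hf, size=f_coarse.shape[-2:],
                                  mode='bilinear', align_corners=False)
            hc, cc = self.coarse(torch.cat([f_coarse, hf_dn], dim=1), (hc, cc))
        # Merge the pair into one feature map at the finer resolution, which a
        # hierarchical scheme could then pair with the next scale.
        hc_up = F.interpolate(hc, size=f_fine.shape[-2:],
                              mode='bilinear', align_corners=False)
        return hf + hc_up


# Example usage with hypothetical shapes: a 64x64 fine map and a 32x32
# coarse map, both with 256 channels.
block = IntertwiningBlock(ch=256)
merged = block(torch.randn(1, 256, 64, 64), torch.randn(1, 256, 32, 32))
```

In this simplified form the end-to-end training described in the abstract corresponds to learning the gate convolutions of both chains under a pixel-level segmentation loss; replacing the fixed grid resizing with aggregation over super-pixel neighborhoods would yield the image-adapted routes the paper describes.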