Abstract
Image-based classifification of tissue histology, in terms of different components (e.g., normal signature, categories of aberrant signatures), provides a series of indices for tumor composition. Subsequently, aggregation of these indices in each whole slide image (WSI) from a large cohort can provide predictive models of clinical outcome. However, the performance of the existing techniques is hindered as a result of large technical and biological variations that are always present in a large cohort. In this paper, we propose two algorithms for classifification of tissue histology based on robust representations of morphometric context, which are built upon nuclear level morphometric features at various locations and scales within the spatial pyramid matching (SPM) framework. These methods have been evaluated on two distinct datasets of different tumor types collected from The Cancer Genome Atlas (TCGA), and the experimental results indicate that our methods are (i) extensible to different tumor types; (ii) robust in the presence of wide technical and biological variations; (iii) invariant to different nuclear segmentation strategies; and (iv) scalable with varying training sample size. In addition, our experiments suggest that enforcing sparsity, during the construction of morphometric context, further improves the performance of the system