Abstract
We introduce the problem of grouping a finite set V into m blocks where each block is a subset of V and where: (i) the blocks are individually highly valued by a submodular function f (both robustly and in the average case) while satisfying block-specific matroid constraints; and (ii) block scores interact where blocks are jointl scored highly via f , thus making the blocks mutually non-redundant. Submodular functions are good models of information and diversity; thus, the above can be seen as grouping V into matroid constrained blocks that are both intraand inter-diverse. Potential applications include form ing ensembles of classification/regression models, partitioning data for parallel processing, and sum marization. In the non-robust case, we reduce the problem to non-monotone submodular maximization subject to multiple matroid constraints. In the mixed robust/average case, we offer a bicriterion guarantee for a polynomial time deterministic algorithm and a probabilistic guarantee for randomized algorithm, as long as the involved submodular functions (including the inter-block interaction terms) are monotone. We close with a case study in which we use these algorithms to find high quality diverse ensembles of classifiers showing good results.