Abstract
Motivated by extremely large-scale machine learning problems, we introduce a new multistage algorithmic framework for submodular maximization (called MULTGREED), where at each stage we apply an approximate greedy procedure to maximize surrogate submodular functions. The surrogates serve as proxies for a target submodular function but require less memory and are easy to evaluate. We theoretically analyze the performance guarantee of the multistage framework and give examples of how to design instances of MULTGREED for a broad range of natural submodular functions. We show that, given appropriate surrogate functions, MULTGREED performs very close to the standard greedy algorithm, and we argue that our framework can easily be integrated with distributive algorithms for further optimization. We complement our theory by empirically evaluating it on several real-world problems, including data subset selection on millions of speech samples, where MULTGREED yields a speedup of at least a thousand times and superior results over state-of-the-art selection methods.
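The multistage idea can be illustrated with a minimal sketch, offered under stated assumptions rather than as the paper's actual procedure. It assumes a cardinality constraint split into per-stage budgets, uses exact greedy steps in place of an approximate greedy procedure, and treats the last function in the list as the target. The names `multistage_greedy`, `stage_functions`, `stage_budgets`, and the toy `coverage` and `modular` functions are hypothetical and introduced only for illustration.

```python
from typing import Callable, Hashable, List, Sequence, Set

SetFunction = Callable[[Set[Hashable]], float]


def multistage_greedy(
    ground_set: Sequence[Hashable],
    stage_functions: Sequence[SetFunction],
    stage_budgets: Sequence[int],
) -> List[Hashable]:
    """Greedy selection that switches objective from stage to stage.

    Assumption: stage j adds stage_budgets[j] elements, each chosen to
    maximize the marginal gain of stage_functions[j] given everything
    selected so far (including elements from earlier stages).
    """
    if len(stage_functions) != len(stage_budgets):
        raise ValueError("need exactly one budget per stage function")

    selected: List[Hashable] = []
    chosen: Set[Hashable] = set()
    remaining = list(ground_set)  # keeps input order, so ties break deterministically

    for f, budget in zip(stage_functions, stage_budgets):
        for _ in range(budget):
            if not remaining:
                return selected
            base = f(chosen)
            # Element with the largest marginal gain under the current stage's function.
            best = max(remaining, key=lambda v: f(chosen | {v}) - base)
            chosen.add(best)
            selected.append(best)
            remaining.remove(best)
    return selected


# Toy instance (hypothetical data): the target is set coverage, and the
# surrogate is a modular function that ignores overlap and is cheap to evaluate.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2}}


def coverage(S: Set[Hashable]) -> float:
    return float(len(set().union(*(sets[v] for v in S))))


def modular(S: Set[Hashable]) -> float:
    return float(sum(len(sets[v]) for v in S))


# Stage 1 uses the cheap surrogate for one pick; stage 2 uses the target for two.
result = multistage_greedy(list(sets), [modular, coverage], [1, 2])
```

In this toy run the surrogate picks the largest set first and the target function then fills in the elements that add new coverage. Any saving in a realistic setting would come from the early stages being much cheaper to evaluate than the target, which this small example does not measure.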