Abstract
With the success of social media, many companies turn to social media sites to monitor the reputation of their brands and the opinions of the general public. To help companies monitor their brands, in this work we study the task of extracting representative aspects and posts from users' free-text posts in social media. Previous efforts treat it as a traditional information extraction task and overlook properties specific to social media, such as the noise in user-generated posts and the varying impact of posts. In contrast, we extract aspects by maximizing their representativeness, a new notion we define that accounts for both the coverage of aspects and the impact of posts. We formalize the task as a submodular optimization problem and develop the FastPAS algorithm to jointly select representative posts and aspects. FastPAS optimizes its parameters greedily, which is highly efficient and reaches a good solution with theoretical guarantees. Extensive experiments on two datasets demonstrate that our method outperforms state-of-the-art aspect extraction and summarization methods in identifying representative aspects.
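To illustrate the kind of greedy submodular selection referred to above, the following is a minimal sketch and not the FastPAS algorithm itself. It assumes a simplified objective, weighted aspect coverage, in which each post covers a set of aspects and each aspect has a weight. The function name `greedy_weighted_coverage`, the toy posts, and the weights are all hypothetical. For monotone submodular objectives such as this one, greedy selection under a cardinality constraint is known to achieve at least a (1 - 1/e) fraction of the optimal value; the specific guarantee for FastPAS is not restated here.

```python
from typing import Dict, List, Set


def greedy_weighted_coverage(
    post_aspects: Dict[str, Set[str]],
    aspect_weight: Dict[str, float],
    k: int,
) -> List[str]:
    """Select up to k posts, each time adding the post with the largest
    marginal gain in total weight of newly covered aspects.

    Illustrative stand-in objective (weighted coverage), not the
    representativeness function defined in the paper.
    """
    selected: List[str] = []
    covered: Set[str] = set()
    for _ in range(k):
        best_post, best_gain = None, 0.0
        for post, aspects in post_aspects.items():
            if post in selected:
                continue
            # Marginal gain: weight of aspects this post adds beyond those covered.
            gain = sum(aspect_weight.get(a, 0.0) for a in aspects - covered)
            if gain > best_gain:
                best_post, best_gain = post, gain
        if best_post is None:  # no remaining post adds any weight
            break
        selected.append(best_post)
        covered |= post_aspects[best_post]
    return selected


# Hypothetical toy data: three posts and four weighted aspects.
posts = {
    "p1": {"battery", "screen"},
    "p2": {"battery"},
    "p3": {"price", "camera"},
}
weights = {"battery": 3.0, "screen": 1.0, "price": 2.0, "camera": 1.5}

chosen = greedy_weighted_coverage(posts, weights, k=2)
print(chosen)  # ['p1', 'p3']
```

In this toy run, `p1` is chosen first (gain 4.0); `p2` then adds nothing new because `battery` is already covered, so `p3` (gain 3.5) is chosen second. This diminishing marginal gain is the property that makes the objective submodular.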