Abstract
Recent advances in network embedding have revolutionized the field of graph and network mining. However, (pre-)training embeddings for very
large-scale networks is computationally challenging for most existing methods. In this work, we
present ProNE1—a fast, scalable, and effective
model, whose single-thread version is 10–400×
faster than efficient network embedding benchmarks with 20 threads, including LINE, DeepWalk,
node2vec, GraRep, and HOPE. As a concrete example, the single-thread ProNE requires only 29
hours to embed a network of hundreds of millions
of nodes while it takes LINE weeks and DeepWalk months by using 20 threads. To achieve
this, ProNE first initializes network embeddings ef-
ficiently by formulating the task as sparse matrix
factorization. The second step of ProNE is to enhance the embeddings by propagating them in the
spectrally modulated space. Extensive experiments
on networks of various scales and types demonstrate that ProNE achieves both effectiveness and
significant efficiency superiority when compared to
the aforementioned baselines. In addition, ProNE’s
embedding enhancement step can be also generalized for improving other models at speed, e.g., offering >10% relative gains for the used baselines