Abstract
Graph-based semi-supervised learning is important for many classification tasks, but most existing methods assume that the labelled nodes are randomly sampled. In the presence of nonignorable nonresponse, ignoring the missing nodes can lead to significant estimation bias and handicap the classifiers. To address this issue, we propose a Graph-based joint model with Nonignorable Missingness (GNM) and develop an imputation and inverse probability weighting estimation approach. We further use graph neural networks to model nonlinear link functions and then use a gradient descent (GD) algorithm to estimate all the parameters of GNM. We prove the identifiability of the GNM model and validate its predictive performance in both simulations and real data analysis by comparing it with models that ignore or misspecify the missingness mechanism. Our method achieves up to a 7.5% improvement over the baseline model for the document classification task on the Cora dataset.
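
To make the role of inverse probability weighting concrete, the following toy sketch may help. It is not the GNM estimator: it has no graph, no neural network, and no imputation step, and it assumes the true response probabilities are known, whereas in practice they would have to be estimated. The function name `ipw_mean` and the simulated values (label rate 0.5, response probabilities 0.8 and 0.2) are illustrative assumptions. The sketch only shows that, when the chance of observing a label depends on the label itself, the average over observed labels is biased, while reweighting each observed label by the inverse of its response probability largely removes that bias.

```python
import numpy as np


def ipw_mean(y, observed, response_prob):
    """Normalized inverse-probability-weighted estimate of the mean of y.

    Only entries with observed == True contribute; each is weighted by
    1 / response_prob. The response probabilities are assumed known here.
    """
    weights = observed / response_prob          # zero weight where unobserved
    y_seen = np.where(observed, y, 0.0)         # never read unobserved labels
    return float(np.sum(weights * y_seen) / np.sum(weights))


rng = np.random.default_rng(0)
n = 20000

# Hypothetical binary labels with a true positive rate of about 0.5.
y = rng.binomial(1, 0.5, size=n)

# Nonignorable mechanism (assumed for illustration): the probability that
# a label is observed depends on the label itself.
response_prob = np.where(y == 1, 0.8, 0.2)
observed = rng.random(n) < response_prob

true_mean = float(y.mean())
naive_mean = float(y[observed].mean())          # ignores the missingness
weighted_mean = ipw_mean(y, observed, response_prob)

print(f"true={true_mean:.3f} naive={naive_mean:.3f} ipw={weighted_mean:.3f}")
```

In this simulation the naive average is pulled towards the over-represented class, while the weighted average stays close to the full-sample mean. The same reweighting idea can be applied to a per-node classification loss rather than to a simple mean.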