Abstract
The universal approximation theorem, in one of its most general versions, says that if we consider only continuous activation functions then a standard feedforward neural network with one hidden layer is able to approximate any continuous multivariate function f to any given approximation threshold if and only if is non-polynomial. In this paper, we give a direct algebraic proof of the theorem. Furthermore we shall explicitly quantify the number of hidden units required for approximation. Specifically, if X is compact, then a neuralnetwork with n input units, m output units, and a single hidden layer with hidden units (independent of m and ), can uniformly approximate any polynomial function f : X whose total degree is at most d for each of its m coordinate functions. In the general case that f is any continuous function, we show there exists some (independent of m), such that N hidden units would suffice to approximate f . We also show that this uniform approximation property (UAP) still holds even under seemingly strong conditions imposed on the weights. We highlight several consequences: > 0, the UAP still holds if we restrict all non-bias weights w in the last layer to satisfy |w| <There exists some > 0 (depending only on f and ), such that the UAP still holds if we restrict all non-bias weights w in the first layer to satisfy |w| > (iii) If the non-bias weights in the first layer are fixed and randomly chosen from a suitable range, then the UAP holds with probability 1.