Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Yitong Sun, Anna Gilbert, Ambuj Tewari

We propose random ReLU features models in this work. Its motivation is rooted in both kernel methods and neural networks. We prove the universality and generalization performance of random ReLU features. Parallel to Barron's theorem, we consider the ReLU feature class, extended from the reproducing kernel Hilbert space of random ReLU features, and prove a strong quantitative approximation theorem, where both inner weights and outer weights of the the neural network with ReLU nodes as an approximator are bounded by constants. We also prove a similar approximation theorem for composition of functions in ReLU feature class by multi-layer ReLU networks. Separation theorem between ReLU feature class and their composition is proved as a consequence of separation between shallow and deep networks. These results reveal nice properties of ReLU nodes from the view of approximation theory, providing support for regularization on weights of ReLU networks and for the use of random ReLU features in practice. Our experiments confirm that the performance of random ReLU features is comparable with random Fourier features.

Via

Anna Gilbert, Ambuj Tewari, Yitong Sun

We prove that, under low noise assumptions, the support vector machine with $N\ll m$ random features (RFSVM) can achieve the learning rate faster than $O(1/\sqrt{m})$ on a training set with $m$ samples when an optimized feature map is used. Our work extends the previous fast rate analysis of random features method from least square loss to 0-1 loss. We also show that the reweighted feature selection method, which approximates the optimized feature map, helps improve the performance of RFSVM in experiments on a synthetic data set.

Via