Adam R Klivans

Learning Narrow One-Hidden-Layer ReLU Networks

Apr 20, 2023
Sitan Chen, Zehao Dou, Surbhi Goel, Adam R Klivans, Raghu Meka

We consider the well-studied problem of learning a linear combination of $k$ ReLU activations with respect to a Gaussian distribution on inputs in $d$ dimensions. We give the first polynomial-time algorithm that succeeds whenever $k$ is a constant. All prior polynomial-time learners require additional assumptions on the network, such as positive combining coefficients or the matrix of hidden weight vectors being well-conditioned. Our approach is based on analyzing random contractions of higher-order moment tensors. We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together, sidestepping the conditioning issues present in prior work. This allows us to design an iterative procedure to discover individual neurons.
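
The core primitive, a random contraction of a higher-order moment tensor, is easy to illustrate. Below is a minimal NumPy sketch, not the paper's multi-scale algorithm: it estimates the fourth Hermite moment tensor of the labels, contracted twice with a random direction $g$. For $f(x) = \sum_i \lambda_i \mathrm{relu}(\langle w_i, x \rangle)$ with unit-norm $w_i$, this matrix equals (up to a nonzero constant and sampling error) $\sum_i \lambda_i c_4 \langle w_i, g \rangle^2 w_i w_i^\top$, so its range lies in the span of the hidden weight vectors. The function name and the toy setup are assumptions made for the illustration.

import numpy as np

def hermite4_contraction(y, x, g):
    # Monte-Carlo estimate of E[y * He_4(x)](., ., g, g): the order-4
    # (probabilists') Hermite tensor of the labels, contracted twice with g.
    # Illustrative sketch only; not the algorithm from the paper.
    n, d = x.shape
    xg = x @ g                                   # <x_j, g> for each sample
    I, Ey = np.eye(d), y.mean()
    A = (x * (y * xg**2)[:, None]).T @ x / n     # E[y <x,g>^2 x x^T]
    B = np.mean(y * xg**2) * I                   # E[y <x,g>^2] I
    C = (x * y[:, None]).T @ x / n * (g @ g)     # E[y x x^T] ||g||^2
    v = (x * (y * xg)[:, None]).mean(axis=0)     # E[y <x,g> x]
    D = np.outer(v, g) + np.outer(g, v)
    return A - B - 2.0 * D - C + Ey * (g @ g) * I + 2.0 * Ey * np.outer(g, g)

# Toy check: k = 2 neurons in d = 8 dimensions, with signed coefficients
# (no positivity or conditioning assumptions on the network).
rng = np.random.default_rng(0)
d, k, n = 8, 2, 500_000
W = rng.standard_normal((k, d))
W /= np.linalg.norm(W, axis=1, keepdims=True)
lam = np.array([1.0, -0.7])
x = rng.standard_normal((n, d))
y = np.maximum(x @ W.T, 0.0) @ lam
M = hermite4_contraction(y, x, rng.standard_normal(d))
# The top eigenvectors of M approximately span the rows of W.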

* 33 pages, comments welcome 

Efficiently Learning Any One Hidden Layer ReLU Network From Queries

Nov 08, 2021
Sitan Chen, Adam R Klivans, Raghu Meka

Model extraction attacks have renewed interest in the classic problem of learning neural networks from queries. In this work we give the first polynomial-time algorithm for learning arbitrary one hidden layer neural networks with ReLU activations, given black-box query access to the network. Formally, we show that if $F$ is an arbitrary one hidden layer ReLU network, there is an algorithm, with query complexity and running time polynomial in all parameters, that outputs a network $F'$ achieving low square loss relative to $F$ with respect to the Gaussian measure. While a number of works in the security literature have proposed and empirically demonstrated the effectiveness of certain algorithms for this problem, ours is the first with fully polynomial-time guarantees of efficiency, even for worst-case networks (in particular, our algorithm succeeds in the overparameterized setting).
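
To give intuition for why query access is powerful here, the following is a minimal, hypothetical NumPy sketch of one standard subroutine in this line of work, not the paper's full algorithm: estimating the gradient of the black-box network by finite-difference queries. Because a one hidden layer ReLU network is piecewise linear, its gradient at a generic point is $\sum_i \lambda_i \mathbf{1}[\langle w_i, x \rangle > 0] w_i$, and gradients on opposite sides of a single ReLU boundary differ by $\pm \lambda_i w_i$, exposing an individual neuron. All names below are assumptions for the illustration.

import numpy as np

def query_gradient(F, x, eps=1e-4):
    # Estimate grad F(x) with central finite differences, using 2d
    # black-box queries to F. (Illustrative subroutine only.)
    d = x.shape[0]
    grad = np.zeros(d)
    for j in range(d):
        e = np.zeros(d)
        e[j] = eps
        grad[j] = (F(x + e) - F(x - e)) / (2.0 * eps)
    return grad

# Toy black-box: F(x) = sum_i lam_i * relu(<w_i, x>), accessed only via queries.
rng = np.random.default_rng(1)
d = 10
W = rng.standard_normal((3, d))
lam = np.array([0.5, -1.0, 2.0])
F = lambda x: np.maximum(W @ x, 0.0) @ lam

a, b = rng.standard_normal(d), rng.standard_normal(d)
# If the segment [a, b] happens to cross exactly one ReLU boundary
# <w_i, x> = 0, this difference equals +/- lam_i * w_i for that neuron.
diff = query_gradient(F, a) - query_gradient(F, b)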

* To appear in Advances in Neural Information Processing Systems (NeurIPS 2021) 