Daniil Dmitriev

Deterministic equivalent and error universality of deep random features learning

Feb 01, 2023
Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro

This manuscript considers the problem of learning a random Gaussian network function using a fully connected network with frozen intermediate layers and a trainable readout layer. This problem can be seen as a natural generalization of the widely studied random features model to deeper architectures. First, we prove Gaussian universality of the test error in a ridge regression setting where the learner and target networks share the same intermediate layers, and provide a sharp asymptotic formula for it. Establishing this result requires proving a deterministic equivalent for traces of the deep random features sample covariance matrices, which can be of independent interest. Second, we conjecture the asymptotic Gaussian universality of the test error in the more general setting of arbitrary convex losses and generic learner/target architectures. We provide extensive numerical evidence for this conjecture, which requires the derivation of closed-form expressions for the layer-wise post-activation population covariances. In light of our results, we investigate the interplay between architecture design and implicit regularization.
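
To make the setting concrete, here is a minimal NumPy sketch (not the authors' code): inputs are propagated through frozen random Gaussian layers, and only the readout is fitted by ridge regression on the last-layer features. The widths, the tanh activation, the ridge penalty `lam`, and the seeds are illustrative assumptions; the learner and target use different frozen layers, as in the more general conjectured setting.

```python
import numpy as np

rng = np.random.default_rng(0)

def deep_random_features(X, widths, activation=np.tanh, seed=0):
    """Propagate inputs through frozen random Gaussian layers; return last post-activations."""
    layer_rng = np.random.default_rng(seed)
    h = X
    for d_out in widths:
        d_in = h.shape[1]
        W = layer_rng.normal(0.0, 1.0 / np.sqrt(d_in), size=(d_in, d_out))  # frozen weights
        h = activation(h @ W)
    return h

# Target: a random Gaussian network (teacher) with its own frozen layers.
n, d = 2000, 100
X = rng.normal(size=(n, d))
target_feats = deep_random_features(X, widths=[150, 150], seed=1)
y = target_feats @ rng.normal(size=150) / np.sqrt(150)

# Learner: different frozen layers; only the readout is trained, via ridge regression.
Phi = deep_random_features(X, widths=[200, 200], seed=2)
lam = 1e-2  # ridge penalty (illustrative)
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Phi.shape[1]), Phi.T @ y)
print("train MSE:", np.mean((Phi @ w - y) ** 2))
```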

Dynamic Model Pruning with Feedback

Jun 12, 2020
Tao Lin, Sebastian U. Stich, Luis Barba, Daniil Dmitriev, Martin Jaggi

Deep neural networks often have millions of parameters. This can hinder their deployment to low-end devices, not only due to high memory requirements but also because of increased latency at inference. We propose a novel model compression method that generates a sparse trained model without additional overhead: by (i) allowing dynamic allocation of the sparsity pattern and (ii) incorporating a feedback signal to reactivate prematurely pruned weights, we obtain a performant sparse model in a single training pass (retraining is not needed, but can further improve performance). We evaluate our method on CIFAR-10 and ImageNet, and show that the obtained sparse models can reach the state-of-the-art performance of dense models. Moreover, their performance surpasses that of models generated by all previously proposed pruning schemes.

* appearing at ICLR 2020 
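
The two ingredients can be illustrated with a minimal NumPy sketch (not the paper's implementation) on a linear least-squares toy problem: the sparsity mask is recomputed at every step from the largest-magnitude weights, the gradient is taken through the pruned model, and the update is applied to the dense weights so prematurely pruned coordinates can be reactivated. The dimensions, sparsity budget `k`, and learning rate are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, k = 500, 100, 10                       # samples, dimension, sparsity budget (weights kept)
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:k] = rng.normal(size=k)
y = X @ w_true + 0.01 * rng.normal(size=n)

w = 0.01 * rng.normal(size=d)                # dense weights, maintained throughout training
lr = 0.1 / n
for step in range(500):
    # (i) dynamic sparsity pattern: keep the k largest-magnitude dense weights
    mask = np.zeros(d)
    mask[np.argsort(np.abs(w))[-k:]] = 1.0
    w_sparse = w * mask
    grad = X.T @ (X @ w_sparse - y)          # gradient through the pruned model
    # (ii) feedback: the update is applied to the dense weights, so pruned
    # coordinates keep receiving signal and can be reactivated later
    w -= lr * grad

w_deployed = w * mask                        # the deployed model is the sparse one
print("sparse-model MSE:", np.mean((X @ w_deployed - y) ** 2))
```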