Title: SWAT-NN: Simultaneous Weights and Architecture Training for Neural Networks in a Latent Space

Jun 11, 2025

Zitong Huang, Mansooreh Montazerin, Ajitesh Srivastava

Abstract: Designing neural networks typically relies on manual trial and error or a neural architecture search (NAS) followed by weight training. The former is time-consuming and labor-intensive, while the latter often discretizes architecture search and weight optimization. In this paper, we propose a fundamentally different approach that simultaneously optimizes both the architecture and the weights of a neural network. Our framework first trains a universal multi-scale autoencoder that embeds both architectural and parametric information into a continuous latent space, where functionally similar neural networks are mapped closer together. Given a dataset, we then randomly initialize a point in the embedding space and update it via gradient descent to obtain the optimal neural network, jointly optimizing its structure and weights. The optimization process incorporates sparsity and compactness penalties to promote efficient models. Experiments on synthetic regression tasks demonstrate that our method effectively discovers sparse and compact neural networks with strong performance.
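
The latent-space search described in the abstract can be illustrated with a deliberately simplified toy. This is a sketch under stated assumptions, not the authors' implementation: the pretrained multi-scale autoencoder is replaced by a frozen random linear map (`decoder`), the decoded network is a single linear layer, only an L1 sparsity penalty is used (the compactness penalty is omitted), and every name and hyperparameter below (`optimize_latent`, `lr`, `l1`, `steps`) is hypothetical.

```python
import numpy as np

def optimize_latent(X, y, decoder, z0, lr=0.01, l1=0.01, steps=500):
    """Gradient descent on a latent point z; the decoder stays frozen.

    Toy assumption: the decoder is a linear map, so the decoded
    "network" is just a weight vector w = decoder @ z.
    """
    z = z0.copy()
    history = []
    for _ in range(steps):
        w = decoder @ z                       # decode latent point into weights
        resid = X @ w - y                     # prediction error on the dataset
        loss = np.mean(resid ** 2) + l1 * np.abs(w).sum()
        history.append(loss)
        # Gradient of the MSE plus a subgradient of the L1 penalty, w.r.t. w
        grad_w = 2.0 * X.T @ resid / len(y) + l1 * np.sign(w)
        # Chain rule through the frozen decoder gives the gradient w.r.t. z
        z -= lr * (decoder.T @ grad_w)
    return z, history

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
true_w = np.array([2.0, 0.0, 0.0, -1.5, 0.0, 0.0, 0.0, 0.0])
y = X @ true_w                                # synthetic regression target

decoder = rng.normal(size=(8, 4)) / 2.0       # hypothetical frozen decoder, latent dim 4
z0 = rng.normal(size=4)                       # random initial latent point

z, history = optimize_latent(X, y, decoder, z0)
print(f"loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

Only the latent vector `z` is updated; the dataset and the decoder are fixed, which mirrors the idea of searching for a network inside an embedding space that has already been learned.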
