In this paper, we consider a general non-convex constrained optimization problem, where the objective function is weakly convex and the constraint function is convex while they can both be non-smooth. This class of problems arises from many applications in machine learning such as fairness-aware supervised learning. To solve this problem, we consider the classical switching subgradient method by Polyak (1965), which is an intuitive and easily implementable first-order method. Before this work, its iteration complexity was only known for convex optimization. We prove its oracle complexity for finding a nearly stationary point when the objective function is non-convex. The analysis is derived separately when the constraint function is deterministic and stochastic. Compared to existing methods, especially the double-loop methods, the switching gradient method can be applied to non-smooth problems and only has a single loop, which saves the effort on tuning the number of inner iterations.
We propose a federated learning method with weighted nodes in which the weights can be modified to optimize the model's performance on a separate validation set. The problem is formulated as a bilevel optimization where the inner problem is a federated learning problem with weighted nodes and the outer problem focuses on optimizing the weights based on the validation performance of the model returned from the inner problem. A communication-efficient federated optimization algorithm is designed to solve this bilevel optimization problem. Under an error-bound assumption, we analyze the generalization performance of the output model and identify scenarios when our method is in theory superior to training a model only locally and to federated learning with static and evenly distributed weights.