Understanding the training dynamics of deep learning models is perhaps a necessary step toward demystifying the effectiveness of these models. In particular, how do data from different classes gradually become separable in their feature spaces when training neural networks using stochastic gradient descent? In this study, we model the evolution of features during deep learning training using a set of stochastic differential equations (SDEs), each of which corresponds to a training sample. As a crucial ingredient in our modeling strategy, each SDE contains a drift term that reflects the impact of backpropagation at an input on the features of all samples. Our main finding uncovers a sharp phase transition phenomenon regarding the intra-class impact: if the SDEs are locally elastic in the sense that the impact is more significant on samples from the same class as the input, the features of the training data become linearly separable, meaning vanishing training loss; otherwise, the features are not separable, regardless of how long the training time is. Moreover, in the presence of local elasticity, an analysis of our SDEs shows the emergence of a simple geometric structure in the features called neural collapse. Taken together, our results shed light on the decisive role of local elasticity in the training dynamics of neural networks. We corroborate our theoretical analysis with experiments on a synthesized dataset of geometric shapes and CIFAR-10.
This paper investigates terahertz (THz) ultra-massive (UM)-MIMO-based angle estimation for space-to-air communications, which can solve the performance degradation problem caused by the dual delay-beam squint effects of terahertz UM-MIMO channels. Specifically, we first design a grouping true-time delay unit module that can significantly mitigate the impact of delay-beam squint effects to establish the space-to-air THz link. Based on the subarray selection scheme, the UM hybrid array can be equivalently considered as a low-dimensional fully-digital array, and then the fine estimates of azimuth/elevation angles at both the UAVs and the satellite can be separately acquired using the proposed prior-aided iterative angle estimation algorithm. The simulation results, which are close to the Cram\'{e}r-Rao lower bounds, verify the effectiveness of our solution.
In deep learning with differential privacy (DP), a neural network usually achieves privacy at the cost of slower convergence (and thus lower performance) than its non-private counterpart. This work gives the first convergence analysis of DP deep learning, through the lens of training dynamics and the neural tangent kernel (NTK). Our convergence theory successfully characterizes the effects of two key components in DP training: per-sample clipping (flat or layerwise) and noise addition. Our analysis not only initiates a general principled framework to understand DP deep learning with any network architecture and loss function, but also motivates a new clipping method, global clipping, which significantly improves the convergence while preserving the same privacy guarantee as the existing local clipping. In terms of theoretical results, we establish the precise connection between per-sample clipping and the NTK matrix. We show that in the gradient flow, i.e., with infinitesimal learning rate, the noise level of DP optimizers does not affect the convergence. We prove that DP gradient descent (GD) with global clipping guarantees monotone convergence to zero loss, which can be violated by the existing DP-GD with local clipping. Notably, our analysis framework easily extends to other optimizers, e.g., DP-Adam. Empirically, DP optimizers equipped with global clipping perform strongly on a wide range of classification and regression tasks. In particular, our global clipping is surprisingly effective at learning calibrated classifiers, in contrast to existing DP classifiers, which are often over-confident and unreliable. Implementation-wise, the new clipping can be realized by adding one line of code to the Opacus library.
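The two components named in the abstract above, per-sample clipping and noise addition, can be illustrated with a minimal sketch. It shows only the standard flat (local) per-sample clipping followed by Gaussian noise, as commonly used in DP gradient descent. It does not implement the global clipping proposed in the paper, which the abstract does not define, and it does not use the Opacus API. The function names `clip_per_sample` and `dp_gradient` and the parameters `clip_norm` and `noise_multiplier` are assumptions made for illustration.

```python
import math
import random


def clip_per_sample(grads, clip_norm):
    """Flat per-sample clipping: rescale each per-sample gradient so that
    its L2 norm is at most clip_norm (gradients already inside are kept)."""
    clipped = []
    for g in grads:
        norm = math.sqrt(sum(x * x for x in g))
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        clipped.append([x * scale for x in g])
    return clipped


def dp_gradient(grads, clip_norm, noise_multiplier, rng):
    """Clip each per-sample gradient, sum them, add Gaussian noise with
    standard deviation noise_multiplier * clip_norm to every coordinate,
    and average over the batch."""
    clipped = clip_per_sample(grads, clip_norm)
    dim = len(grads[0])
    summed = [sum(g[j] for g in clipped) for j in range(dim)]
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(grads) for v in noisy]
```

A call such as `dp_gradient(per_sample_grads, clip_norm=1.0, noise_multiplier=1.1, rng=random.Random(0))` would return the privatized average gradient for one step; the numeric values here are placeholders, not settings from the paper.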
Over the past decade, learning a dictionary from input images for sparse modeling has been one of the topics receiving the most research attention in image processing and compressed sensing. Most existing dictionary learning methods, such as the K-SVD method, consider an over-complete dictionary, which may result in high mutual coherence and therefore have a negative impact on recognition. On the other hand, the sparse codes are usually optimized by adding an $\ell_0$- or $\ell_1$-norm penalty, but with no strict sparsity guarantee. In this paper, we propose an orthogonal dictionary learning model that can obtain strictly sparse codes and an orthogonal dictionary with a global sequence convergence guarantee. We find that our method can yield better denoising results than over-complete dictionary based learning methods, and has the additional advantage of high computational efficiency.
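One reason an orthogonal dictionary can give strictly sparse codes cheaply is a standard fact: when the atoms are orthonormal, the best code with at most k nonzeros is found by projecting the signal onto the atoms and keeping the k largest-magnitude coefficients. The sketch below illustrates that closed-form step only. It is not the authors' learning algorithm, which the abstract does not specify, and the names `sparse_code_orthogonal` and `reconstruct` are hypothetical.

```python
import math


def sparse_code_orthogonal(atoms, x, k):
    """Strictly k-sparse code of x for an orthonormal dictionary.

    atoms is a list of orthonormal vectors (the dictionary columns).
    The coefficients are inner products with the atoms; all but the
    k largest in magnitude are set to zero (hard thresholding)."""
    coeffs = [sum(a * b for a, b in zip(atom, x)) for atom in atoms]
    order = sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)
    keep = set(order[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


def reconstruct(atoms, code):
    """Synthesize the signal as the code-weighted sum of the atoms."""
    dim = len(atoms[0])
    return [sum(code[i] * atoms[i][j] for i in range(len(atoms))) for j in range(dim)]
```

Because the code has exactly the requested number of nonzeros by construction, sparsity is guaranteed rather than encouraged through a penalty weight.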
The emerging space-air-ground integrated network has attracted intensive research and necessitates reliable and efficient aeronautical communications. This paper investigates terahertz Ultra-Massive (UM)-MIMO-based aeronautical communications and proposes an effective channel estimation and tracking scheme, which can solve the performance degradation problem caused by the unique {\emph{triple delay-beam-Doppler squint effects}} of aeronautical terahertz UM-MIMO channels. Specifically, based on the rough angle estimates acquired from navigation information, an initial aeronautical link is established, where the delay-beam squint at the transceiver can be significantly mitigated by employing a Grouping True-Time Delay Unit (GTTDU) module (e.g., the designed {\emph{Rotman lens}}-based GTTDU module). Using the proposed prior-aided iterative angle estimation algorithm, azimuth/elevation angles can be estimated, and these angles are adopted to achieve precise beam alignment and to refine the GTTDU module for further eliminating the delay-beam squint. Doppler shifts can subsequently be estimated using the proposed prior-aided iterative Doppler shift estimation algorithm. On this basis, path delays and channel gains can be estimated accurately, where the Doppler squint can be effectively attenuated via a compensation process. For data transmission, a data-aided decision-directed channel tracking algorithm is developed to track the beam-aligned effective channels. When the data-aided channel tracking is invalid, angles will be re-estimated at the pilot-aided channel tracking stage with an equivalent sparse digital array, where angle ambiguity can be resolved based on the previously estimated angles. The simulation results and the derived Cram\'{e}r-Rao lower bounds verify the effectiveness of our solution.
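The beam squint mentioned above can be illustrated with a simplified textbook model: a uniform linear array with half-wavelength spacing at the carrier frequency, steered either by frequency-independent phase shifters or by ideal per-antenna true-time delays. This sketch is an assumption-laden illustration only. It does not model the grouped GTTDU module, the planar UM-MIMO array, or the delay and Doppler squint of the paper, and the function names are hypothetical.

```python
import cmath
import math


def array_gain(n_antennas, f, f_c, theta, theta0, true_time_delay=False):
    """Normalized gain toward angle theta at frequency f for a uniform
    linear array (spacing = half a wavelength at f_c) steered to theta0.

    Phase shifters apply a phase designed for f_c at every frequency.
    Ideal true-time delays apply a phase that scales with f, so the
    steering direction does not move across the band."""
    steer_scale = f / f_c if true_time_delay else 1.0
    total = 0j
    for n in range(n_antennas):
        phase = math.pi * n * (
            (f / f_c) * math.sin(theta) - steer_scale * math.sin(theta0)
        )
        total += cmath.exp(1j * phase)
    return abs(total) / n_antennas


def squinted_angle(f, f_c, theta0):
    """Direction in which a phase-shifter-steered beam actually points at
    frequency f: sin(theta) = (f_c / f) * sin(theta0)."""
    return math.asin((f_c / f) * math.sin(theta0))
```

With a large array and a wide bandwidth, the phase-shifter gain toward the intended angle collapses at the band edge because the beam has moved to `squinted_angle(f, f_c, theta0)`, whereas the true-time-delay gain stays at one. This is the effect that delay-based compensation aims to remove.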
We propose using machine learning models for the direct synthesis of on-chip electromagnetic (EM) passive structures to enable rapid or even automated design and optimization of RF/mm-Wave circuits. As a proof of concept, we demonstrate the direct synthesis of a 1:1 transformer on a 45 nm SOI process using our proposed neural network model. Using pre-existing transformer S-parameter files and their corresponding geometric designs as training samples, the model predicts target geometric designs.
Helpfulness prediction techniques have been widely used to identify and recommend high-quality online reviews to customers. Currently, the vast majority of studies assume that a review's helpfulness is self-contained. In practice, however, customers hardly process reviews independently, given that reviews are presented in sequence. The perceived helpfulness of a review is likely to be affected by its sequential neighbors (i.e., context), which has been largely ignored. This paper proposes a new methodology to capture the missing interaction between reviews and their neighbors. The first end-to-end neural architecture is developed for neighbor-aware helpfulness prediction (NAP). For each review, NAP allows for three types of neighbor selection: its preceding, following, and surrounding neighbors. Four weighting schemes are designed to learn context clues from the selected neighbors. A review is then contextualized into the learned clues for neighbor-aware helpfulness prediction. NAP is evaluated on six domains of real-world online reviews against a series of state-of-the-art baselines. Extensive experiments confirm the effectiveness of NAP and the influence of sequential neighbors on the current review. Further hyperparameter analysis reveals three main findings. (1) On average, eight neighbors treated with uneven importance are engaged for context construction. (2) The benefit of neighbor-aware prediction mainly results from closer neighbors. (3) Equally considering up to five closest neighbors of a review can usually produce a weaker but tolerable prediction result.
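The three neighbor selection types named above (preceding, following, surrounding) can be sketched as simple window operations over a review sequence. This is an illustrative guess at the selection step only. The function name `select_neighbors`, the window size `k`, and the choice that "surrounding" takes up to `k` reviews on each side are assumptions, and the four weighting schemes are not modeled because the abstract does not describe them.

```python
def select_neighbors(sequence, index, k, mode):
    """Return the neighbors of sequence[index] in display order.

    mode "preceding":   up to k reviews shown before the current one
    mode "following":   up to k reviews shown after the current one
    mode "surrounding": both of the above (assumed: up to k per side)"""
    if mode not in {"preceding", "following", "surrounding"}:
        raise ValueError(f"unknown mode: {mode}")
    before = sequence[max(0, index - k):index]
    after = sequence[index + 1:index + 1 + k]
    if mode == "preceding":
        return before
    if mode == "following":
        return after
    return before + after
```

Reviews near either end of the sequence simply receive fewer neighbors, so a downstream model would need to handle context windows of varying length.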
This paper reviews the NTIRE 2020 challenge on video quality mapping (VQM), which addresses the issue of quality mapping from a source video domain to a target video domain. The challenge includes both a supervised track (track 1) and a weakly-supervised track (track 2) for two benchmark datasets. In particular, track 1 offers a new Internet video benchmark, requiring algorithms to learn the mapping from more compressed videos to less compressed videos in a supervised training manner. In track 2, algorithms are required to learn the quality mapping from one device to another when their quality varies substantially and weakly-aligned video pairs are available. For track 1, a total of 7 teams competed in the final test phase, demonstrating novel and effective solutions to the problem. For track 2, some existing methods are evaluated, showing promising solutions to the weakly-supervised video quality mapping problem.
At present, many companies use the most advanced Deep Neural Networks (DNNs) to classify and analyze the photos we upload to social networks or the cloud. To prevent users' privacy from leaking, the attack characteristics of adversarial examples can be exploited to make these models misjudge. In this paper, we take advantage of reversible image transformation to construct a reversible adversarial example, which is still an adversarial example to DNNs. It not only causes DNNs to extract the wrong information, but can also be restored to its original image without any distortion. Experimental results show that reversible adversarial examples obtained by our method have higher attack success rates while ensuring that the reversible image quality is still high. Moreover, the proposed method is easy to operate and suitable for practical applications.