Since proposed in the 70s, the Non-Equilibrium Green Function (NEGF) method has been recognized as a standard approach to quantum transport simulations. Although it achieves superiority in simulation accuracy, the tremendous computational cost makes it unbearable for high-throughput simulation tasks such as sensitivity analysis, inverse design, etc. In this work, we propose AD-NEGF, to our best knowledge the first end-to-end differentiable NEGF model for quantum transport simulations. We implement the entire numerical process in PyTorch, and design customized backward pass with implicit layer techniques, which provides gradient information at an affordable cost while guaranteeing the correctness of the forward simulation. The proposed model is validated with applications in calculating differential physical quantities, empirical parameter fitting, and doping optimization, which demonstrates its capacity to accelerate the material design process by conducting gradient-based parameter optimization.
Graph Neural Network (GNN) is an emerging technique for graph-based learning tasks such as node classification. In this work, we reveal the vulnerability of GNN to the imbalance of node labels. Traditional solutions for imbalanced classification (e.g. resampling) are ineffective in node classification without considering the graph structure. Worse still, they may even bring overfitting or underfitting results due to lack of sufficient prior knowledge. To solve these problems, we propose a novel graph neural network framework with curriculum learning (GNN-CL) consisting of two modules. For one thing, we hope to acquire certain reliable interpolation nodes and edges through the novel graph-based oversampling based on smoothness and homophily. For another, we combine graph classification loss and metric learning loss which adjust the distance between different nodes associated with minority class in feature space. Inspired by curriculum learning, we dynamically adjust the weights of different modules during training process to achieve better ability of generalization and discrimination. The proposed framework is evaluated via several widely used graph datasets, showing that our proposed model consistently outperforms the existing state-of-the-art methods.
The Schr\"odinger equation is at the heart of modern quantum mechanics. Since exact solutions of the ground state are typically intractable, standard approaches approximate Schr\"odinger equation as forms of nonlinear generalized eigenvalue problems $F(V)V = SV\Lambda$ in which $F(V)$, the matrix to be decomposed, is a function of its own top-$k$ smallest eigenvectors $V$, leading to a "self-consistency problem". Traditional iterative methods heavily rely on high-quality initial guesses of $V$ generated via domain-specific heuristics methods based on quantum mechanics. In this work, we eliminate such a need for domain-specific heuristics by presenting a novel framework, Self-consistent Gradient-like Eigen Decomposition (SCGLED) that regards $F(V)$ as a special "online data generator", thus allows gradient-like eigendecomposition methods in streaming $k$-PCA to approach the self-consistency of the equation from scratch in an iterative way similar to online learning. With several critical numerical improvements, SCGLED is robust to initial guesses, free of quantum-mechanism-based heuristics designs, and neat in implementation. Our experiments show that it not only can simply replace traditional heuristics-based initial guess methods with large performance advantage (achieved averagely 25x more precise than the best baseline in similar wall time), but also is capable of finding highly precise solutions independently without any traditional iterative methods.
Person re-identification (Re-ID) has achieved great success in the supervised scenario. However, it is difficult to directly transfer the supervised model to arbitrary unseen domains due to the model overfitting to the seen source domains. In this paper, we aim to tackle the generalizable multi-source person Re-ID task (i.e., there are multiple available source domains, and the testing domain is unseen during training) from the data augmentation perspective, thus we put forward a novel method, termed MixNorm, which consists of domain-aware mix-normalization (DMN) and domain-ware center regularization (DCR). Different from the conventional data augmentation, the proposed domain-aware mix-normalization to enhance the diversity of features during training from the normalization view of the neural network, which can effectively alleviate the model overfitting to the source domains, so as to boost the generalization capability of the model in the unseen domain. To better learn the domain-invariant model, we further develop the domain-aware center regularization to better map the produced diverse features into the same space. Extensive experiments on multiple benchmark datasets validate the effectiveness of the proposed method and show that the proposed method can outperform the state-of-the-art methods. Besides, further analysis also reveals the superiority of the proposed method.
The quasiparticle effective mass $m^\ast$ of interacting electrons is a fundamental quantity in the Fermi liquid theory. However, the precise value of the effective mass of uniform electron gas is still elusive after decades of research. The newly developed neural canonical transformation approach arXiv:2105.08644 offers a principled way to extract the effective mass of electron gas by directly calculating the thermal entropy at low temperature. The approach models a variational many-electron density matrix using two generative neural networks: an autoregressive model for momentum occupation and a normalizing flow for electron coordinates. Our calculation reveals a suppression of effective mass in the two-dimensional spin-polarized electron gas, which is more pronounced than previous reports in the low-density strong-coupling region. This prediction calls for verification in two-dimensional electron gas experiments.
In this paper, we focus on the decentralized optimization problem over the Stiefel manifold, which is defined on a connected network of $d$ agents. The objective is an average of $d$ local functions, and each function is privately held by an agent and encodes its data. The agents can only communicate with their neighbors in a collaborative effort to solve this problem. In existing methods, multiple rounds of communications are required to guarantee the convergence, giving rise to high communication costs. In contrast, this paper proposes a decentralized algorithm, called DESTINY, which only invokes a single round of communications per iteration. DESTINY combines gradient tracking techniques with a novel approximate augmented Lagrangian function. The global convergence to stationary points is rigorously established. Comprehensive numerical experiments demonstrate that DESTINY has a strong potential to deliver a cutting-edge performance in solving a variety of testing problems.
In this paper, we propose a Few-shot Learning pipeline for 3D skeleton-based action recognition by Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE). To factor out misalignment between query and support sequences of 3D body joints, we propose an advanced variant of Dynamic Time Warping which jointly models each smooth path between the query and support frames to achieve simultaneously the best alignment in the temporal and simulated camera viewpoint spaces for end-to-end learning under the limited few-shot training data. Sequences are encoded with a temporal block encoder based on Simple Spectral Graph Convolution, a lightweight linear Graph Neural Network backbone (we also include a setting with a transformer). Finally, we propose a similarity-based loss which encourages the alignment of sequences of the same class while preventing the alignment of unrelated sequences. We demonstrate state-of-the-art results on NTU-60, NTU-120, Kinetics-skeleton and UWA3D Multiview Activity II.
In a quantum processor, the device design and external controls together contribute to the quality of the target quantum operations. As we continuously seek better alternative qubit platforms, we explore the increasingly large device and control design space. Thus, optimization becomes more and more challenging. In this work, we demonstrate that the figure of merit reflecting a design goal can be made differentiable with respect to the device and control parameters. In addition, we can compute the gradient of the design objective efficiently in a similar manner to the back-propagation algorithm and then utilize the gradient to optimize the device and the control parameters jointly and efficiently. This extends the scope of the quantum optimal control to superconducting device design. We also demonstrate the viability of gradient-based joint optimization over the device and control parameters through a few examples.
As the default protocol for exchanging routing reachability information on the Internet, the abnormal behavior in traffic of Border Gateway Protocols (BGP) is closely related to Internet anomaly events. The BGP anomalous detection model ensures stable routing services on the Internet through its real-time monitoring and alerting capabilities. Previous studies either focused on the feature selection problem or the memory characteristic in data, while ignoring the relationship between features and the precise time correlation in feature (whether it's long or short term dependence). In this paper, we propose a multi-view model for capturing anomalous behaviors from BGP update traffic, in which Seasonal and Trend decomposition using Loess (STL) method is used to reduce the noise in the original time-series data, and Graph Attention Network (GAT) is used to discover feature relationships and time correlations in feature, respectively. Our results outperform the state-of-the-art methods at the anomaly detection task, with the average F1 score up to 96.3% and 93.2% on the balanced and imbalanced datasets respectively. Meanwhile, our model can be extended to classify multiple anomalous and to detect unknown events.
Human action recognition still exists many challenging problems such as different viewpoints, occlusion, lighting conditions, human body size and the speed of action execution, although it has been widely used in different areas. To tackle these challenges, the Kinect depth sensor has been developed to record real time depth sequences, which are insensitive to the color of human clothes and illumination conditions. Many methods on recognizing human action have been reported in the literature such as HON4D, HOPC, RBD and HDG, which use the 4D surface normals, pointclouds, skeleton-based model and depth gradients respectively to capture discriminative information from depth videos or skeleton data. In this research project, the performance of four aforementioned algorithms will be analyzed and evaluated using five benchmark datasets, which cover challenging issues such as noise, change of viewpoints, background clutters and occlusions. We also implemented and improved the HDG algorithm, and applied it in cross-view action recognition using the UWA3D Multiview Activity dataset. Moreover, we used different combinations of individual feature vectors in HDG for performance evaluation. The experimental results show that our improvement of HDG outperforms other three state-of-the-art algorithms for cross-view action recognition.