Senwei Liang

Reproducing Activation Function for Deep Learning

Jan 13, 2021
Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang

In this paper, we propose the reproducing activation function to improve deep learning accuracy in applications ranging from computer vision to scientific computing. The idea is to combine several basic functions through a learnable linear combination, giving each neuron its own data-driven activation function. Armed with such activation functions, deep neural networks can reproduce traditional approximation tools and therefore approximate target functions with fewer parameters than traditional neural networks. In terms of training dynamics, reproducing activation functions can generate neural tangent kernels with a better condition number than traditional activation functions, lessening the spectral bias of deep learning. As demonstrated by extensive numerical tests, the proposed activation function can facilitate the convergence of deep learning optimization to a solution with higher accuracy than existing deep learning solvers for audio/image/video reconstruction, PDEs, and eigenvalue problems.
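The construction described in the abstract can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the basis set (ReLU, quadratic, sine) and the names `BASIS`, `combined_activation`, and `Neuron` are hypothetical, since the abstract does not say which basic functions are used or how the coefficients are trained.

```python
import math

# Hypothetical basis set: the abstract does not name the basic functions.
BASIS = (
    lambda x: max(x, 0.0),  # ReLU
    lambda x: x * x,        # quadratic
    math.sin,               # sine
)


def combined_activation(x, coeffs):
    """Linear combination of the basis functions evaluated at a scalar x."""
    return sum(c * f(x) for c, f in zip(coeffs, BASIS))


class Neuron:
    """A single neuron that carries its own combination coefficients.

    In a real network the coefficients would be trained together with the
    weights and bias; here they are simply stored.
    """

    def __init__(self, weights, bias, coeffs):
        self.weights = weights
        self.bias = bias
        self.coeffs = coeffs

    def forward(self, inputs):
        # Affine pre-activation followed by the neuron-specific activation.
        z = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return combined_activation(z, self.coeffs)
```

Setting the coefficients to `(1, 0, 0)` recovers an ordinary ReLU neuron, so a standard network is a special case of this parameterization.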

Quantifying spatial homogeneity of urban road networks via graph neural networks

Jan 01, 2021
Jiawei Xue, Nan Jiang, Senwei Liang, Qiyuan Pang, Satish V. Ukkusuri, Jianzhu Ma

The spatial homogeneity of an urban road network (URN) measures whether each distinct component is analogous to the whole network and can serve as a quantitative measure bridging network structure and dynamics. However, given the complexity of cities, it is challenging to quantify spatial homogeneity based on conventional network statistics alone. In this work, we use Graph Neural Networks to model 11,790 URN samples across 30 cities worldwide and use their predictability to define spatial homogeneity. The proposed measurement can be viewed as a non-linear integration of multiple geometric properties, such as degree, betweenness, and road network type, and as a strong indicator of mixed socio-economic events, such as GDP and population growth. City clusters derived from transferring spatial homogeneity can be interpreted well by continental urbanization histories. We expect this novel metric to support various subsequent tasks in transportation, urban planning, and geography.

* 22 pages, 5 figures

Efficient Attention Network: Accelerate Attention by Searching Where to Plug

Nov 28, 2020
Zhongzhan Huang, Senwei Liang, Mingfu Liang, Wei He, Haizhao Yang

Recently, many plug-and-play self-attention modules have been proposed to enhance model generalization by exploiting the internal information of deep convolutional neural networks (CNNs). Previous works emphasize the design of attention modules for specific functionality, e.g., lightweight or task-oriented attention. However, they ignore the importance of where to plug in the attention module: they take for granted that a separate module is connected to every block of the CNN backbone, so the computational cost and number of parameters grow with network depth. Thus, we propose a framework called Efficient Attention Network (EAN) to improve the efficiency of existing attention modules. In EAN, we leverage the sharing mechanism (Huang et al. 2020) to share the attention module within the backbone and search where to connect the shared attention module via reinforcement learning. Finally, we obtain an attention network with sparse connections between the backbone and modules, while (1) maintaining accuracy, (2) reducing the extra parameter increment, and (3) accelerating inference. Extensive experiments on widely used benchmarks and popular attention networks show the effectiveness of EAN. Furthermore, we empirically illustrate that EAN can transfer to other tasks and capture informative features. The code is available at https://github.com/gbup-group/EAN-efficient-attention-network
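A toy sketch may help picture what "sparse connections" to a shared module means. It assumes a hand-written connection scheme; the reinforcement-learning search that the abstract describes is not shown, and `run_backbone`, the scalar "features", and the stand-in blocks are all hypothetical simplifications rather than the released code.

```python
def run_backbone(x, blocks, shared_attention, connections):
    """Apply every block, then the one shared attention module where connected.

    `connections` holds one boolean per block. In the abstract this scheme is
    found by a reinforcement-learning search; here it is supplied by hand.
    """
    attention_calls = 0
    for block, connected in zip(blocks, connections):
        x = block(x)
        if connected:
            x = shared_attention(x)  # same module (same parameters) each time
            attention_calls += 1
    return x, attention_calls


# Toy stand-ins: scalar features, four identical blocks, one shared module.
blocks = [lambda v: v + 1.0] * 4
shared_attention = lambda v: 0.5 * v

dense = (True, True, True, True)       # attention after every block
sparse = (False, True, False, True)    # attention after selected blocks only
```

With the sparse scheme the attention module runs twice instead of four times, which is the source of the inference savings the abstract refers to.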

Machine Learning for Prediction with Missing Dynamics

Oct 13, 2019
John Harlim, Shixiao W. Jiang, Senwei Liang, Haizhao Yang

This article presents a general framework for recovering missing dynamical systems using available data and machine learning techniques. The proposed framework reformulates the prediction problem as a supervised learning problem to approximate a map that takes the memories of the resolved and identifiable unresolved variables to the missing components in the resolved dynamics. We demonstrate the effectiveness of the proposed framework with a theoretical guarantee of path-wise convergence of the resolved variables up to finite time and with numerical tests on prototypical models in various scientific domains. These include the 57-mode barotropic stress models with multiscale interactions that mimic the blocked and unblocked patterns observed in the atmosphere; the nonlinear Schrödinger equation, which has found many applications in physics such as optics and Bose-Einstein condensates; and the Kuramoto-Sivashinsky equation, whose spatiotemporally chaotic pattern formation models trapped ion modes in plasma and phase dynamics in reaction-diffusion systems. While many machine learning techniques can be used to validate the proposed framework, we found that recurrent neural networks outperform kernel regression methods in terms of recovering the trajectory of the resolved components and the equilibrium one-point and two-point statistics. This superb performance suggests that recurrent neural networks are an effective tool for recovering missing dynamics that involve the approximation of high-dimensional functions.
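The reformulation as supervised learning can be illustrated by how training pairs might be assembled from a time series. The sketch below is an assumption-laden simplification: the function name `build_memory_dataset`, the scalar series, and the choice to align each target with the last step of its memory window are hypothetical and not taken from the paper.

```python
def build_memory_dataset(observed, missing, memory):
    """Turn two aligned time series into supervised (input, target) pairs.

    Each input is a window of `memory` consecutive observed values; the target
    is the missing-dynamics term at the window's final time step (an assumed
    alignment). Any regressor, e.g. a recurrent network or kernel regression,
    could then be fit to these pairs.
    """
    if len(observed) != len(missing):
        raise ValueError("observed and missing must have the same length")
    inputs, targets = [], []
    for end in range(memory, len(observed) + 1):
        inputs.append(tuple(observed[end - memory:end]))
        targets.append(missing[end - 1])
    return inputs, targets


# Example: a memory of two steps over a four-step series yields three pairs.
pairs = build_memory_dataset([0.0, 1.0, 2.0, 3.0], [10.0, 11.0, 12.0, 13.0], memory=2)
```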

Instance Enhancement Batch Normalization: an Adaptive Regulator of Batch Noise

Aug 12, 2019
Senwei Liang, Zhongzhan Huang, Mingfu Liang, Haizhao Yang

Batch Normalization (BN) (Ioffe and Szegedy 2015) normalizes the features of an input image using the statistics of a batch of images; this batch information can be regarded as batch noise that BN brings to the features of each instance. We offer the point of view that a self-attention mechanism can help regulate the batch noise by enhancing instance-specific information. Based on this view, we propose combining BN with a self-attention mechanism to adjust the batch noise, giving an attention-based version of BN called Instance Enhancement Batch Normalization (IEBN), which recalibrates channel information by a simple linear transformation. IEBN outperforms BN with a light parameter increment in various visual tasks across different network structures and benchmark data sets. Moreover, even under the attack of synthetic noise, IEBN can still stabilize network training with good generalization. The code of IEBN is available at https://github.com/gbup-group/IEBN
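One way to picture "BN plus an instance-specific, linear recalibration" is the single-channel sketch below. It is not the authors' implementation: the gate form (a sigmoid of a linear function of the instance's own channel mean), its placement after normalization, and the names `instance_gated_bn_channel`, `scale`, and `shift` are assumptions made for illustration. The linked repository is the place to check the actual layer.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def batch_norm_channel(batch, eps=1e-5):
    """Normalize one channel with statistics pooled over the whole batch.

    `batch` is a list of instances, each a list of that channel's values.
    """
    values = [v for instance in batch for v in instance]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [[(v - mean) / math.sqrt(var + eps) for v in instance] for instance in batch]


def instance_gated_bn_channel(batch, scale, shift, eps=1e-5):
    """Batch-normalize, then rescale each instance by its own gate.

    The gate depends only on the instance's channel mean through a linear map
    (scale * mean + shift), so it adds two parameters per channel (assumed).
    """
    normalized = batch_norm_channel(batch, eps)
    output = []
    for raw, norm in zip(batch, normalized):
        instance_mean = sum(raw) / len(raw)
        gate = sigmoid(scale * instance_mean + shift)
        output.append([gate * v for v in norm])
    return output
```

With `scale = 0` the gate is the same for every instance and the layer reduces to a uniformly rescaled BN, so any instance-specific behaviour comes entirely from the learned `scale`.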

DIANet: Dense-and-Implicit Attention Network

May 25, 2019
Zhongzhan Huang, Senwei Liang, Mingfu Liang, Haizhao Yang

Attention-based deep neural networks (DNNs) that emphasize the informative features in a local receptive field of an input image have successfully boosted the performance of deep learning in various challenging problems. In this paper, we propose a Dense-and-Implicit-Attention (DIA) unit that can be applied universally to different network architectures and enhances their generalization capacity by repeatedly fusing information across different network layers. The communication of information between layers is carried out via a modified Long Short-Term Memory (LSTM) module within the DIA unit that runs in parallel with the DNN. The shared DIA unit links multi-scale features from different depth levels of the network implicitly and densely. Experiments on benchmark datasets show that the DIA unit is capable of emphasizing channel-wise feature interrelation and leads to significant improvement in image classification accuracy. We further show empirically that the DIA unit is a nonlocal normalization tool that enhances Batch Normalization. The code is released at https://github.com/gbup-group/DIANet.

Drop-Activation: Implicit Parameter Reduction and Harmonic Regularization

Nov 19, 2018
Senwei Liang, Yuehaw Khoo, Haizhao Yang

Overfitting frequently occurs in deep learning. In this paper, we propose a novel regularization method called Drop-Activation to reduce overfitting and improve generalization. The key idea is to drop nonlinear activation functions by randomly setting them to identity functions during training. During testing, we use a deterministic network with a new activation function that encodes the average effect of randomly dropping activations. Experimental results on CIFAR-10, CIFAR-100, SVHN, and EMNIST show that Drop-Activation generally improves the performance of popular neural network architectures. Furthermore, unlike dropout, Drop-Activation as a regularizer can be used in harmony with standard training and regularization techniques such as Batch Normalization and AutoAug. Our theoretical analyses support the regularization effect of Drop-Activation as implicit parameter reduction and its capability to be used together with Batch Normalization.
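The train/test behaviour described above can be sketched for a single scalar activation. The choice of ReLU as the nonlinearity, the parameter name `keep_prob`, and the function name `drop_activation` are assumptions for illustration; the test-time branch is simply the expected value of the random training-time branch.

```python
import random


def relu(x):
    return x if x > 0 else 0.0


def drop_activation(x, keep_prob, training, rng=random):
    """Randomly replace the nonlinearity by the identity during training.

    Training: apply ReLU with probability `keep_prob`, otherwise pass x through.
    Testing: use the deterministic average of those two outcomes.
    """
    if training:
        return relu(x) if rng.random() < keep_prob else x
    return keep_prob * relu(x) + (1.0 - keep_prob) * x


# For a negative input, training outputs are either 0.0 (ReLU kept) or the
# input itself (ReLU dropped); the test-time output is their weighted average.
test_output = drop_activation(-2.0, keep_prob=0.75, training=False)
```

For positive inputs both branches return `x`, so with ReLU the randomness only affects negative pre-activations, where the test-time function behaves like a leaky ReLU with slope `1 - keep_prob`.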
