Wen Shen

Batch Normalization Is Blind to the First and Second Derivatives of the Loss

Jun 02, 2022

Zhanpeng Zhou, Wen Shen, Huixin Chen, Ling Tang, Quanshi Zhang

Abstract: In this paper, we prove the effects of the BN operation on the back-propagation of the first and second derivatives of the loss. When we do the Taylor series expansion of the loss function, we prove that the BN operation will block the influence of the first-order term and most of the influence of the second-order term of the loss. We also find that this problem is caused by the standardization phase of the BN operation. Experimental results have verified our theoretical conclusions, and we have found that the BN operation significantly affects feature representations in specific tasks, where losses of different samples share similar analytic formulas.
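The role of the standardization phase can be illustrated with the textbook backward pass of z-scoring one feature over a batch. The following is a minimal NumPy sketch, not the paper's derivation: the function names are hypothetical, and the affine scale/shift and the epsilon term of a real BN layer are omitted by assumption. It shows that the backward pass removes any part of the incoming gradient that is constant across the batch or proportional to the standardized output.

```python
import numpy as np

def standardize(x):
    """Standardization phase of BN for one feature over a batch (no affine part, no epsilon)."""
    return (x - x.mean()) / x.std()

def standardize_backward(x, grad_y):
    """Gradient w.r.t. x, given the gradient w.r.t. the standardized output y."""
    y = standardize(x)
    # Subtracting the mean removes the component of grad_y along the all-ones vector;
    # subtracting y * mean(grad_y * y) removes the component along y itself.
    return (grad_y - grad_y.mean() - y * (grad_y * y).mean()) / x.std()

rng = np.random.default_rng(0)
x = rng.normal(size=8)
y = standardize(x)

# Incoming gradient that is identical for every sample in the batch.
grad_const = standardize_backward(x, np.ones_like(x))
# Incoming gradient proportional to the standardized output.
grad_along_y = standardize_backward(x, y)
# A generic incoming gradient is only partly removed.
grad_generic = standardize_backward(x, rng.normal(size=8))

print(np.abs(grad_const).max(), np.abs(grad_along_y).max(), np.linalg.norm(grad_generic))
```

The first two printed values should be at floating-point noise level, while the third stays clearly nonzero.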

Why Adversarial Training of ReLU Networks Is Difficult?

May 30, 2022

Xu Cheng, Hao Zhang, Yue Xin, Wen Shen, Jie Ren, Quanshi Zhang

Abstract: This paper mathematically derives an analytic solution of the adversarial perturbation on a ReLU network, and theoretically explains the difficulty of adversarial training. Specifically, we formulate the dynamics of the adversarial perturbation generated by the multi-step attack, which shows that the adversarial perturbation tends to strengthen eigenvectors corresponding to a few top-ranked eigenvalues of the Hessian matrix of the loss w.r.t. the input. We also prove that adversarial training tends to strengthen the influence of unconfident input samples with large gradient norms in an exponential manner. Besides, we find that adversarial training strengthens the influence of the Hessian matrix of the loss w.r.t. network parameters, which makes adversarial training more likely to oscillate along directions of a few samples, and increases the difficulty of adversarial training. Crucially, our proofs provide a unified explanation for previous findings in understanding adversarial training.
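The claim about top-ranked Hessian eigenvalues can be illustrated on a toy problem. The sketch below is an assumption-laden simplification, not the paper's analytic solution: it uses a second-order approximation of the loss with a fixed gradient `g` and a fixed diagonal Hessian `H` (both made up for illustration), and plain gradient ascent without any norm constraint or sign step. Under those assumptions, the share of the perturbation lying along the top eigenvector grows with the number of attack steps.

```python
import numpy as np

def multi_step_perturbation(g, H, step_size, num_steps):
    """Gradient ascent on the quadratic model g^T d + 0.5 * d^T H d, starting from d = 0."""
    delta = np.zeros_like(g)
    for _ in range(num_steps):
        # Gradient of the quadratic model at the current perturbation is g + H @ delta.
        delta = delta + step_size * (g + H @ delta)
    return delta

# Hypothetical toy Hessian: diagonal, so the eigenvectors are the coordinate axes
# and index 0 corresponds to the largest eigenvalue.
H = np.diag([3.0, 1.0, 0.2])
g = np.ones(3)

delta_short = multi_step_perturbation(g, H, step_size=0.1, num_steps=5)
delta_long = multi_step_perturbation(g, H, step_size=0.1, num_steps=50)

# Fraction of the perturbation norm carried by the top eigenvector.
share_short = abs(delta_short[0]) / np.linalg.norm(delta_short)
share_long = abs(delta_long[0]) / np.linalg.norm(delta_long)
print(share_short, share_long)
```

Each eigen-direction grows by a factor of (1 + step_size * eigenvalue) per step in this model, which is why the largest eigenvalue dominates after many steps.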

Interpreting Representation Quality of DNNs for 3D Point Cloud Processing

Nov 05, 2021

Wen Shen, Qihan Ren, Dongrui Liu, Quanshi Zhang

Abstract: In this paper, we evaluate the quality of knowledge representations encoded in deep neural networks (DNNs) for 3D point cloud processing. We propose a method to disentangle the overall model vulnerability into the sensitivity to rotation, translation, scale, and local 3D structures. We also propose metrics to evaluate the spatial smoothness of encoding 3D structures, and the representation complexity of the DNN. Based on such analysis, experiments expose representation problems with classic DNNs, and explain the utility of adversarial training.

Interpretable Compositional Convolutional Neural Networks

Jul 09, 2021

Wen Shen, Zhihua Wei, Shikun Huang, Binbin Zhang, Jiaqi Fan, Ping Zhao, Quanshi Zhang

Abstract: The reasonable definition of semantic interpretability presents the core challenge in explainable AI. This paper proposes a method to modify a traditional convolutional neural network (CNN) into an interpretable compositional CNN, in order to learn filters that encode meaningful visual patterns in intermediate convolutional layers. In a compositional CNN, each filter is supposed to consistently represent a specific compositional object part or image region with a clear meaning. The compositional CNN learns from image labels for classification without any annotations of parts or regions for supervision. Our method can be broadly applied to different types of CNNs. Experiments have demonstrated the effectiveness of our method.

* IJCAI2021

Automated Segmentation of Brain Gray Matter Nuclei on Quantitative Susceptibility Mapping Using Deep Convolutional Neural Network

Aug 03, 2020

Chao Chai, Pengchong Qiao, Bin Zhao, Huiying Wang, Guohua Liu, Hong Wu, E Mark Haacke, Wen Shen, Chen Cao, Xinchen Ye(+2 more)

Abstract: Abnormal iron accumulation in the brain subcortical nuclei has been reported to be correlated with various neurodegenerative diseases, and can be measured through the magnetic susceptibility from quantitative susceptibility mapping (QSM). To quantitatively measure the magnetic susceptibility, the nuclei should be accurately segmented, which is a tedious task for clinicians. In this paper, we proposed a double-branch residual-structured U-Net (DB-ResUNet) based on a 3D convolutional neural network (CNN) to automatically segment such brain gray matter nuclei. To better trade off segmentation accuracy against memory efficiency, the proposed DB-ResUNet fed image patches with high resolution and patches with low resolution but a larger field of view into the local and global branches, respectively. Experimental results revealed that by jointly using QSM and T$_\text{1}$ weighted imaging (T$_\text{1}$WI) as inputs, the proposed method was able to achieve better segmentation accuracy than its single-branch counterpart, as well as the conventional atlas-based method and the classical 3D-UNet structure. The susceptibility values and the volumes were also measured, which indicated that the measurements from the proposed DB-ResUNet correlate highly with values from the manually annotated regions of interest.

* submitted to IEEE Transactions on Medical Imaging

Utility Analysis of Network Architectures for 3D Point Cloud Processing

Nov 20, 2019

Shikun Huang, Binbin Zhang, Wen Shen, Zhihua Wei, Quanshi Zhang

Abstract: In this paper, we diagnose deep neural networks for 3D point cloud processing to explore utilities of different network architectures. We propose a number of hypotheses on the effects of specific network architectures on the representation capacity of DNNs. In order to prove the hypotheses, we design five metrics to diagnose various types of DNNs from the following perspectives: information discarding, information concentration, rotation robustness, adversarial robustness, and neighborhood inconsistency. We conduct comparative studies based on such metrics to verify the hypotheses. We further use the verified hypotheses to revise architectures of existing DNNs to improve their utilities. Experiments demonstrate the effectiveness of our method.

3D-Rotation-Equivariant Quaternion Neural Networks

Nov 20, 2019

Binbin Zhang, Wen Shen, Shikun Huang, Zhihua Wei, Quanshi Zhang

Abstract: This paper proposes a set of rules to revise various neural networks for 3D point cloud processing to rotation-equivariant quaternion neural networks (REQNNs). We find that when a neural network uses quaternion features under certain conditions, the network feature naturally has the rotation-equivariance property. Rotation equivariance means that applying a specific rotation transformation to the input point cloud is equivalent to applying the same rotation transformation to all intermediate-layer quaternion features. Besides, the REQNN also ensures that the intermediate-layer features are invariant to the permutation of input points. Compared with the original neural network, the REQNN exhibits higher rotation robustness.
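The definition of rotation equivariance given in the abstract can be checked numerically on a very simple operation. The sketch below is only an illustration under stated assumptions and does not reproduce the REQNN revision rules: `real_weighted_layer` is a hypothetical layer whose outputs are real-weighted sums of the input points (treated as pure quaternions), and the rotation is applied with the standard unit-quaternion formula r * p * conj(r).

```python
import numpy as np

def qmul(a, b):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1 * w2 - x1 * x2 - y1 * y2 - z1 * z2,
        w1 * x2 + x1 * w2 + y1 * z2 - z1 * y2,
        w1 * y2 - x1 * z2 + y1 * w2 + z1 * x2,
        w1 * z2 + x1 * y2 - y1 * x2 + z1 * w2,
    ])

def rotate(point, r):
    """Rotate a 3D point with the unit quaternion r via r * p * conj(r)."""
    p = np.concatenate(([0.0], point))           # embed the point as a pure quaternion
    r_conj = r * np.array([1.0, -1.0, -1.0, -1.0])
    return qmul(qmul(r, p), r_conj)[1:]          # drop the (zero) real part

def real_weighted_layer(points, weights):
    """Hypothetical layer: every output feature is a real-weighted sum of the input points."""
    return weights @ points

rng = np.random.default_rng(0)
points = rng.normal(size=(5, 3))                 # toy point cloud
weights = rng.normal(size=(2, 5))                # real-valued weights (an assumption)
r = rng.normal(size=4)
r /= np.linalg.norm(r)                           # normalize to a unit quaternion

# Rotate the input, then apply the layer.
rotated_then_layer = real_weighted_layer(np.array([rotate(p, r) for p in points]), weights)
# Apply the layer, then rotate the features.
layer_then_rotated = np.array([rotate(f, r) for f in real_weighted_layer(points, weights)])

print(np.allclose(rotated_then_layer, layer_then_rotated))
```

The two orders agree here because a rotation is linear and the weights are real scalars; operations that mix quaternion components differently would generally not commute with the rotation.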

Information Design in Crowdfunding under Thresholding Policies

Mar 28, 2018

Wen Shen, Jacob W. Crandall, Ke Yan, Cristina V. Lopes

Abstract: Crowdfunding has emerged as a prominent way for entrepreneurs to secure funding without sophisticated intermediation. In crowdfunding, an entrepreneur often has to decide how to disclose the campaign status in order to collect as many contributions as possible. Such decisions are difficult to make primarily due to incomplete information. We propose information design as a tool to help the entrepreneur to improve revenue by influencing backers' beliefs. We introduce a heuristic algorithm to dynamically compute information-disclosure policies for the entrepreneur, followed by an empirical evaluation to demonstrate its competitiveness over the widely-adopted immediate-disclosure policy. Our results demonstrate that the immediate-disclosure policy is not optimal when backers follow thresholding policies despite its ease of implementation. With appropriate heuristics, an entrepreneur can benefit from dynamic information disclosure. Our work sheds light on information design in a dynamic setting where agents make decisions using thresholding policies.

* 9 pages, 2 figures, In Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018)

Regulating Highly Automated Robot Ecologies: Insights from Three User Studies

Aug 07, 2017

Wen Shen, Alanoud Al Khemeiri, Abdulla Almehrezi, Wael Al Enezi, Iyad Rahwan, Jacob W. Crandall

Abstract: Highly automated robot ecologies (HARE), or societies of independent autonomous robots or agents, are rapidly becoming an important part of much of the world's critical infrastructure. As with human societies, regulation, wherein a governing body designs rules and processes for the society, plays an important role in ensuring that HARE meet societal objectives. However, to date, a careful study of interactions between a regulator and HARE is lacking. In this paper, we report on three user studies which give insights into how to design systems that allow people, acting as the regulatory authority, to effectively interact with HARE. As in the study of political systems in which governments regulate human societies, our studies analyze how interactions between HARE and regulators are impacted by regulatory power and individual (robot or agent) autonomy. Our results show that regulator power, decision support, and adaptive autonomy can each diminish the social welfare of HARE, and hint at how these seemingly desirable mechanisms can be designed so that they become part of successful HARE.

* In Proceedings of the 5th International Conference on Human Agent Interaction (HAI 2017). ACM, New York, NY, USA, 111-120
* 10 pages, 7 figures, to appear in the 5th International Conference on Human Agent Interaction (HAI-2017), Bielefeld, Germany

An Online Mechanism for Ridesharing in Autonomous Mobility-on-Demand Systems

Mar 02, 2017

Wen Shen, Cristina V. Lopes, Jacob W. Crandall

Abstract: With proper management, Autonomous Mobility-on-Demand (AMoD) systems have great potential to satisfy the transport demands of urban populations by providing safe, convenient, and affordable ridesharing services. Meanwhile, such systems can substantially decrease private car ownership and use, and thus significantly reduce traffic congestion, energy consumption, and carbon emissions. To achieve this objective, an AMoD system requires private information about the demand from passengers. However, due to self-interestedness, passengers are unlikely to cooperate with the service providers in this regard. Therefore, an online mechanism is desirable if it incentivizes passengers to truthfully report their actual demand. For the purpose of promoting ridesharing, we hereby introduce a posted-price, integrated online ridesharing mechanism (IORS) that satisfies desirable properties such as ex-post incentive compatibility, individual rationality, and budget-balance. Numerical results indicate the competitiveness of IORS compared with two benchmarks, namely the optimal assignment and an offline, auction-based mechanism.

* Proceedings of the 25th International Joint Conference on Artificial Intelligence (IJCAI 2016) pp. 475-481
