
Yangfan Zhou

FastAdaBelief: Improving Convergence Rate for Belief-based Adaptive Optimizer by Strong Convexity

Apr 28, 2021
Yangfan Zhou, Kaizhu Huang, Cheng Cheng, Xuguang Wang, Xin Liu

The AdaBelief algorithm demonstrates superior generalization ability compared to the Adam algorithm by viewing the exponential moving average of observed gradients. AdaBelief is proven to have a data-dependent $O(\sqrt{T})$ regret bound when the objective functions are convex, where $T$ is the time horizon. However, how to exploit strong convexity to further improve the convergence rate of AdaBelief remains an open problem. To tackle this problem, we present a novel optimization algorithm under strong convexity, called FastAdaBelief. We prove that FastAdaBelief attains a data-dependent $O(\log T)$ regret bound, which is substantially lower than that of AdaBelief. In addition, the theoretical analysis is validated by extensive experiments performed on open datasets (i.e., CIFAR-10 and Penn Treebank) for image classification and language modeling.
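For orientation, the belief-based update that the abstract refers to can be sketched for a single scalar parameter. This is a minimal illustration of the AdaBelief rule as it is commonly stated, not the FastAdaBelief algorithm, whose details the abstract does not give. The function name `adabelief_step` and the default hyperparameters are assumptions chosen for the example.

```python
import math


def adabelief_step(theta, m, s, grad, t,
                   lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdaBelief-style update for a scalar parameter (illustrative sketch).

    theta: current parameter value
    m:     exponential moving average (EMA) of observed gradients
    s:     EMA of the squared deviation (grad - m)^2, the "belief" term
    t:     1-based step counter, used for bias correction
    """
    # EMA of the gradient, as in Adam.
    m = beta1 * m + (1 - beta1) * grad
    # Unlike Adam, track how far the gradient deviates from its EMA
    # instead of tracking the raw squared gradient.
    s = beta2 * s + (1 - beta2) * (grad - m) ** 2 + eps
    # Bias-correct both moving averages.
    m_hat = m / (1 - beta1 ** t)
    s_hat = s / (1 - beta2 ** t)
    # Small deviation (high "belief") gives a larger step, and vice versa.
    theta = theta - lr * m_hat / (math.sqrt(s_hat) + eps)
    return theta, m, s


# Hypothetical usage: one step on f(theta) = theta^2 starting from theta = 5.
theta, m, s = adabelief_step(theta=5.0, m=0.0, s=0.0, grad=2.0 * 5.0, t=1)
```

The only difference from an Adam step in this sketch is the second-moment line, which measures deviation from the gradient EMA rather than gradient magnitude.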

Detecting Deep Neural Network Defects with Data Flow Analysis

Sep 30, 2019
Jiazhen Gu, Huanlin Xu, Yangfan Zhou, Xin Wang, Hui Xu, Michael Lyu

Deep neural networks (DNNs) have been shown to be promising solutions for many challenging artificial intelligence tasks. However, it is very hard to determine whether the low precision of a DNN model is an inevitable result or is caused by defects. This paper aims to address this challenging problem. We find that the internal data flow footprints of a DNN model can provide insights for locating the root cause effectively. We develop DeepMorph (DNN Tomography) to analyze the root cause, which can guide a DNN developer in improving the model.

* 2 pages 

Data Sanity Check for Deep Learning Systems via Learnt Assertions

Sep 28, 2019
Haochuan Lu, Huanlin Xu, Nana Liu, Yangfan Zhou, Xin Wang

Reliability is a critical consideration for DL-based systems. However, the statistical nature of DL makes it quite vulnerable to invalid inputs, i.e., cases that are not considered in the training phase of a DL model. This paper proposes performing a data sanity check to identify invalid inputs, so as to enhance the reliability of DL-based systems. We design and implement a tool to detect behavior deviation of a DL model when processing an input case. This tool extracts the data flow footprints and applies an assertion-based validation mechanism. The assertions are built automatically and are specifically tailored for DL model data flow analysis. Our experiments, conducted in real-world scenarios, demonstrate that such an assertion-based data sanity check mechanism is effective in identifying invalid input cases.
