Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Lam M. Nguyen

One step closer to unbiased aleatoric uncertainty estimation

Dec 20, 2023

Wang Zhang, Ziwen Ma, Subhro Das, Tsui-Wei Weng, Alexandre Megretski, Luca Daniel, Lam M. Nguyen

Figure 1 for One step closer to unbiased aleatoric uncertainty estimation

Figure 2 for One step closer to unbiased aleatoric uncertainty estimation

Figure 3 for One step closer to unbiased aleatoric uncertainty estimation

Figure 4 for One step closer to unbiased aleatoric uncertainty estimation

Abstract:Neural networks are powerful tools in various applications, and quantifying their uncertainty is crucial for reliable decision-making. In the deep learning field, the uncertainties are usually categorized into aleatoric (data) and epistemic (model) uncertainty. In this paper, we point out that the existing popular variance attenuation method highly overestimates aleatoric uncertainty. To address this issue, we propose a new estimation method by actively de-noising the observed data. By conducting a broad range of experiments, we demonstrate that our proposed approach provides a much closer approximation to the actual data uncertainty than the standard method.

Via

Access Paper or Ask Questions

A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series

Nov 21, 2023

Trang H. Tran, Lam M. Nguyen, Kyongmin Yeo, Nam Nguyen, Roman Vaculin

Figure 1 for A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series

Figure 2 for A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series

Figure 3 for A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series

Figure 4 for A Supervised Contrastive Learning Pretrain-Finetune Approach for Time Series

Abstract:Foundation models have recently gained attention within the field of machine learning thanks to its efficiency in broad data processing. While researchers had attempted to extend this success to time series models, the main challenge is effectively extracting representations and transferring knowledge from pretraining datasets to the target finetuning dataset. To tackle this issue, we introduce a novel pretraining procedure that leverages supervised contrastive learning to distinguish features within each pretraining dataset. This pretraining phase enables a probabilistic similarity metric, which assesses the likelihood of a univariate sample being closely related to one of the pretraining datasets. Subsequently, using this similarity metric as a guide, we propose a fine-tuning procedure designed to enhance the accurate prediction of the target data by aligning it more closely with the learned dynamics of the pretraining datasets. Our experiments have shown promising results which demonstrate the efficacy of our approach.

Via

Access Paper or Ask Questions

Correlated Attention in Transformers for Multivariate Time Series

Nov 20, 2023

Quang Minh Nguyen, Lam M. Nguyen, Subhro Das

Figure 1 for Correlated Attention in Transformers for Multivariate Time Series

Figure 2 for Correlated Attention in Transformers for Multivariate Time Series

Figure 3 for Correlated Attention in Transformers for Multivariate Time Series

Figure 4 for Correlated Attention in Transformers for Multivariate Time Series

Abstract:Multivariate time series (MTS) analysis prevails in real-world applications such as finance, climate science and healthcare. The various self-attention mechanisms, the backbone of the state-of-the-art Transformer-based models, efficiently discover the temporal dependencies, yet cannot well capture the intricate cross-correlation between different features of MTS data, which inherently stems from complex dynamical systems in practice. To this end, we propose a novel correlated attention mechanism, which not only efficiently captures feature-wise dependencies, but can also be seamlessly integrated within the encoder blocks of existing well-known Transformers to gain efficiency improvement. In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregate representations at the sub-series level. This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation. When combined with prevalent Transformer baselines, correlated attention mechanism constitutes a better alternative for encoder-only architectures, which are suitable for a wide range of tasks including imputation, anomaly detection and classification. Extensive experiments on the aforementioned tasks consistently underscore the advantages of correlated attention mechanism in enhancing base Transformer models, and demonstrate our state-of-the-art results in imputation, anomaly detection and classification.

Via

Access Paper or Ask Questions

Promoting Robustness of Randomized Smoothing: Two Cost-Effective Approaches

Oct 11, 2023

Linbo Liu, Trong Nghia Hoang, Lam M. Nguyen, Tsui-Wei Weng

Abstract:Randomized smoothing has recently attracted attentions in the field of adversarial robustness to provide provable robustness guarantees on smoothed neural network classifiers. However, existing works show that vanilla randomized smoothing usually does not provide good robustness performance and often requires (re)training techniques on the base classifier in order to boost the robustness of the resulting smoothed classifier. In this work, we propose two cost-effective approaches to boost the robustness of randomized smoothing while preserving its clean performance. The first approach introduces a new robust training method AdvMacerwhich combines adversarial training and robustness certification maximization for randomized smoothing. We show that AdvMacer can improve the robustness performance of randomized smoothing classifiers compared to SOTA baselines, while being 3x faster to train than MACER baseline. The second approach introduces a post-processing method EsbRS which greatly improves the robustness certificate based on building model ensembles. We explore different aspects of model ensembles that has not been studied by prior works and propose a novel design methodology to further improve robustness of the ensemble based on our theoretical analysis.

Via

Access Paper or Ask Questions

Batch Clipping and Adaptive Layerwise Clipping for Differential Private Stochastic Gradient Descent

Jul 21, 2023

Toan N. Nguyen, Phuong Ha Nguyen, Lam M. Nguyen, Marten Van Dijk

Abstract:Each round in Differential Private Stochastic Gradient Descent (DPSGD) transmits a sum of clipped gradients obfuscated with Gaussian noise to a central server which uses this to update a global model which often represents a deep neural network. Since the clipped gradients are computed separately, which we call Individual Clipping (IC), deep neural networks like resnet-18 cannot use Batch Normalization Layers (BNL) which is a crucial component in deep neural networks for achieving a high accuracy. To utilize BNL, we introduce Batch Clipping (BC) where, instead of clipping single gradients as in the orginal DPSGD, we average and clip batches of gradients. Moreover, the model entries of different layers have different sensitivities to the added Gaussian noise. Therefore, Adaptive Layerwise Clipping methods (ALC), where each layer has its own adaptively finetuned clipping constant, have been introduced and studied, but so far without rigorous DP proofs. In this paper, we propose {\em a new ALC and provide rigorous DP proofs for both BC and ALC}. Experiments show that our modified DPSGD with BC and ALC for CIFAR-$10$ with resnet-$18$ converges while DPSGD with IC and ALC does not.

* 20 pages, 18 Figures

Via

Access Paper or Ask Questions

Learning Robust and Consistent Time Series Representations: A Dilated Inception-Based Approach

Jun 11, 2023

Anh Duy Nguyen, Trang H. Tran, Hieu H. Pham, Phi Le Nguyen, Lam M. Nguyen

Figure 1 for Learning Robust and Consistent Time Series Representations: A Dilated Inception-Based Approach

Figure 2 for Learning Robust and Consistent Time Series Representations: A Dilated Inception-Based Approach

Figure 3 for Learning Robust and Consistent Time Series Representations: A Dilated Inception-Based Approach

Figure 4 for Learning Robust and Consistent Time Series Representations: A Dilated Inception-Based Approach

Abstract:Representation learning for time series has been an important research area for decades. Since the emergence of the foundation models, this topic has attracted a lot of attention in contrastive self-supervised learning, to solve a wide range of downstream tasks. However, there have been several challenges for contrastive time series processing. First, there is no work considering noise, which is one of the critical factors affecting the efficacy of time series tasks. Second, there is a lack of efficient yet lightweight encoder architectures that can learn informative representations robust to various downstream tasks. To fill in these gaps, we initiate a novel sampling strategy that promotes consistent representation learning with the presence of noise in natural time series. In addition, we propose an encoder architecture that utilizes dilated convolution within the Inception block to create a scalable and robust network architecture with a wide receptive field. Experiments demonstrate that our method consistently outperforms state-of-the-art methods in forecasting, classification, and abnormality detection tasks, e.g. ranks first over two-thirds of the classification UCR datasets, with only $40\%$ of the parameters compared to the second-best approach. Our source code for CoInception framework is accessible at https://github.com/anhduy0911/CoInception.

Via

Access Paper or Ask Questions

An End-to-End Time Series Model for Simultaneous Imputation and Forecast

Jun 01, 2023

Trang H. Tran, Lam M. Nguyen, Kyongmin Yeo, Nam Nguyen, Dzung Phan, Roman Vaculin, Jayant Kalagnanam

Figure 1 for An End-to-End Time Series Model for Simultaneous Imputation and Forecast

Figure 2 for An End-to-End Time Series Model for Simultaneous Imputation and Forecast

Figure 3 for An End-to-End Time Series Model for Simultaneous Imputation and Forecast

Figure 4 for An End-to-End Time Series Model for Simultaneous Imputation and Forecast

Abstract:Time series forecasting using historical data has been an interesting and challenging topic, especially when the data is corrupted by missing values. In many industrial problem, it is important to learn the inference function between the auxiliary observations and target variables as it provides additional knowledge when the data is not fully observed. We develop an end-to-end time series model that aims to learn the such inference relation and make a multiple-step ahead forecast. Our framework trains jointly two neural networks, one to learn the feature-wise correlations and the other for the modeling of temporal behaviors. Our model is capable of simultaneously imputing the missing entries and making a multiple-step ahead prediction. The experiments show good overall performance of our framework over existing methods in both imputation and forecasting tasks.

Via

Access Paper or Ask Questions

Label-Free Concept Bottleneck Models

Apr 12, 2023

Tuomas Oikarinen, Subhro Das, Lam M. Nguyen, Tsui-Wei Weng

Figure 1 for Label-Free Concept Bottleneck Models

Figure 2 for Label-Free Concept Bottleneck Models

Figure 3 for Label-Free Concept Bottleneck Models

Figure 4 for Label-Free Concept Bottleneck Models

Abstract:Concept bottleneck models (CBM) are a popular way of creating more interpretable neural networks by having hidden layer neurons correspond to human-understandable concepts. However, existing CBMs and their variants have two crucial limitations: first, they need to collect labeled data for each of the predefined concepts, which is time consuming and labor intensive; second, the accuracy of a CBM is often significantly lower than that of a standard neural network, especially on more complex datasets. This poor performance creates a barrier for adopting CBMs in practical real world applications. Motivated by these challenges, we propose Label-free CBM which is a novel framework to transform any neural network into an interpretable CBM without labeled concept data, while retaining a high accuracy. Our Label-free CBM has many advantages, it is: scalable - we present the first CBM scaled to ImageNet, efficient - creating a CBM takes only a few hours even for very large datasets, and automated - training it for a new dataset requires minimal human effort. Our code is available at https://github.com/Trustworthy-ML-Lab/Label-free-CBM.

* Published at ICLR 2023

Via

Access Paper or Ask Questions

ConCerNet: A Contrastive Learning Based Framework for Automated Conservation Law Discovery and Trustworthy Dynamical System Prediction

Feb 11, 2023

Wang Zhang, Tsui-Wei Weng, Subhro Das, Alexandre Megretski, Luca Daniel, Lam M. Nguyen

Abstract:Deep neural networks (DNN) have shown great capacity of modeling a dynamical system; nevertheless, they usually do not obey physics constraints such as conservation laws. This paper proposes a new learning framework named ConCerNet to improve the trustworthiness of the DNN based dynamics modeling to endow the invariant properties. ConCerNet consists of two steps: (i) a contrastive learning method to automatically capture the system invariants (i.e. conservation properties) along the trajectory observations; (ii) a neural projection layer to guarantee that the learned dynamics models preserve the learned invariants. We theoretically prove the functional relationship between the learned latent representation and the unknown system invariant function. Experiments show that our method consistently outperforms the baseline neural networks in both coordinate error and conservation metrics by a large margin. With neural network based parameterization and no dependence on prior knowledge, our method can be extended to complex and large-scale dynamics by leveraging an autoencoder.

* 22 pages, 7 figures

Via

Access Paper or Ask Questions

Generalizing DP-SGD with Shuffling and Batching Clipping

Dec 12, 2022

Marten van Dijk, Phuong Ha Nguyen, Toan N. Nguyen, Lam M. Nguyen

Abstract:Classical differential private DP-SGD implements individual clipping with random subsampling, which forces a mini-batch SGD approach. We provide a general differential private algorithmic framework that goes beyond DP-SGD and allows any possible first order optimizers (e.g., classical SGD and momentum based SGD approaches) in combination with batch clipping, which clips an aggregate of computed gradients rather than summing clipped gradients (as is done in individual clipping). The framework also admits sampling techniques beyond random subsampling such as shuffling. Our DP analysis follows the $f$-DP approach and introduces a new proof technique which allows us to also analyse group privacy. In particular, for $E$ epochs work and groups of size $g$, we show a $\sqrt{g E}$ DP dependency for batch clipping with shuffling. This is much better than the previously anticipated linear dependency in $g$ and is much better than the previously expected square root dependency on the total number of rounds within $E$ epochs which is generally much more than $\sqrt{E}$.

* 38 pages, 0 figure

Via

Access Paper or Ask Questions