Safety-critical applications like autonomous driving use Deep Neural Networks (DNNs) for object detection and segmentation. These DNNs fail to predict correctly when they observe an Out-of-Distribution (OOD) input, which can lead to catastrophic consequences. Existing OOD detection methods have been studied extensively for image inputs but remain largely unexplored for LiDAR inputs. In this study, we propose two datasets for benchmarking OOD detection in 3D semantic segmentation. We use Maximum Softmax Probability and entropy scores generated by Deep Ensembles and Flipout versions of RandLA-Net as OOD scores. We observe that Deep Ensembles outperform the Flipout model in OOD detection, with higher AUROC scores on both datasets.
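As a concrete illustration, below is a minimal sketch of how the two OOD scores named above can be computed from ensemble softmax outputs; the array shapes and function name are assumptions, not the paper's exact implementation.

```python
import numpy as np

def ood_scores(probs):
    """probs: (M, N, C) softmax outputs of M ensemble members for N inputs."""
    mean_probs = probs.mean(axis=0)               # average the member predictions
    msp = mean_probs.max(axis=-1)                 # Maximum Softmax Probability
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=-1)
    # Low MSP / high entropy suggests an out-of-distribution input.
    return msp, entropy
```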
Neural networks are increasingly used to make high-stakes predictions; for example, meteorologists and hedge funds apply these techniques to time series data. When it comes to prediction, machine learning models have certain limitations (such as lack of expressiveness, vulnerability to domain shift, and overconfidence) which can be addressed using uncertainty estimation. There is a set of expectations regarding how uncertainty should ``behave''. For instance, a wider prediction horizon should lead to more uncertainty, and a model's confidence should be proportional to its accuracy. In this paper, different uncertainty estimation methods are compared on forecasting meteorological time series data and evaluated against these expectations. The results show how each uncertainty estimation method performs on the forecasting task, which partially evaluates the robustness of the predicted uncertainty.
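One of these expectations can be checked mechanically; below is a toy sketch (with simulated ensemble forecasts, so all shapes, names, and values are assumptions) of verifying that predictive spread grows with the forecast horizon.

```python
import numpy as np

rng = np.random.default_rng(0)
horizon = 24
# (members, horizon): noise scale grows per step to mimic compounding error
forecasts = rng.normal(0.0, np.linspace(0.5, 2.0, horizon), size=(10, horizon))

std_per_step = forecasts.std(axis=0)   # predictive spread at each horizon step
print("spread at steps 1, 12, 24:", std_per_step[[0, 11, 23]])
# Expected behavior: the spread grows roughly monotonically with the horizon.
```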
Overfitting and generalization are important concepts in Machine Learning, as only models that generalize are useful for general applications. Yet many students have trouble learning these important concepts through lectures and exercises. In this paper we describe common examples of students misunderstanding overfitting, and provide recommendations for possible solutions. We cover student misconceptions about overfitting, misconceptions about solutions to overfitting, and implementation mistakes that are commonly confused with overfitting issues. We expect that our paper can contribute to improving student understanding and teaching of this important topic.
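For reference, a minimal self-contained demonstration of the core concept (illustrative numbers only, not from the paper): a high-degree polynomial fits the training points almost perfectly but generalizes poorly to held-out data.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(-1, 1, 10))
y_train = np.sin(3 * x_train) + rng.normal(0, 0.1, 10)
x_val = np.linspace(-1, 1, 100)
y_val = np.sin(3 * x_val)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    val_mse = np.mean((np.polyval(coeffs, x_val) - y_val) ** 2)
    # The degree-9 fit has near-zero train error but a much larger val error.
    print(f"degree={degree}: train MSE={train_mse:.4f}, val MSE={val_mse:.4f}")
```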
Self-supervised learning has proved to be a powerful approach to learning image representations without the need for large labeled datasets. For underwater robotics, it is of great interest to design computer vision algorithms to improve perception capabilities such as sonar image classification. Due to the confidential nature of sonar imaging and the difficulty of interpreting sonar images, it is challenging to create large public labeled sonar datasets to train supervised learning algorithms. In this work, we investigate the potential of three self-supervised learning methods (RotNet, Denoising Autoencoders, and Jigsaw) to learn high-quality sonar image representations without the need for human labels. We present pre-training and transfer learning results on real-life sonar image datasets. Our results indicate that self-supervised pre-training yields classification performance comparable to supervised pre-training in a few-shot transfer learning setup across all three methods. Code and self-supervised pre-trained models are available at https://github.com/agrija9/ssl-sonar-images
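As an illustration of how such pretext tasks avoid human labels, here is a minimal sketch of RotNet-style label generation (shapes and the helper name are assumptions; the released code linked above is the authoritative implementation).

```python
import numpy as np

def rotation_pretext_batch(images, rng=None):
    """images: (N, H, W, C) array of unlabeled (e.g. sonar) images."""
    rng = rng if rng is not None else np.random.default_rng()
    rotated, labels = [], []
    for img in images:
        k = rng.integers(4)              # rotate by 0, 90, 180, or 270 degrees
        rotated.append(np.rot90(img, k))
        labels.append(k)                 # the rotation index is the free label
    return np.stack(rotated), np.array(labels)
```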
Neural networks are ubiquitous in many tasks, but trusting their predictions is an open issue. Uncertainty quantification is required for many applications, and disentangled aleatoric and epistemic uncertainties are best. In this paper, we generalize methods that produce disentangled uncertainties to work with different uncertainty quantification methods, and evaluate their capability to produce disentangled uncertainties. Our results show that: there is an interaction between learning aleatoric and epistemic uncertainty, which is unexpected and violates assumptions about aleatoric uncertainty; some methods, like Flipout, produce zero epistemic uncertainty; aleatoric uncertainty is unreliable in the out-of-distribution setting; and Ensembles provide overall the best disentangling quality. We also explore the error produced by the number-of-samples hyperparameter in the sampling softmax function, recommending N > 100 samples. We expect that our formulation and results will help practitioners and researchers choose uncertainty methods and expand the use of disentangled uncertainties, as well as motivate additional research into this topic.
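For context, the standard information-theoretic decomposition underlying such disentangling can be sketched as follows, given N stochastic softmax samples per input (shapes and names are assumptions; the paper's sampling softmax formulation may differ in detail).

```python
import numpy as np

def disentangled_uncertainty(probs, eps=1e-12):
    """probs: (N, C) softmax samples for one input (N > 100 recommended)."""
    mean_p = probs.mean(axis=0)
    total = -np.sum(mean_p * np.log(mean_p + eps))     # predictive entropy
    aleatoric = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    epistemic = total - aleatoric                      # mutual information
    return aleatoric, epistemic
```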
Modeling trajectories generated by robot joints is complex and is required for high-level tasks like trajectory generation, clustering, and classification. Disentangled representation learning promises advances in unsupervised learning, but it has not been evaluated on robot-generated trajectories. In this paper we evaluate three disentangling VAEs ($\beta$-VAE, Decorr VAE, and a new $\beta$-Decorr VAE) on a dataset of 1M robot trajectories generated by a 3 DoF robot arm. We find that the decorrelation-based formulations perform best in terms of disentangling metrics, trajectory quality, and correlation with ground-truth latent features. We expect that these results will increase the use of unsupervised learning in robot control.
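For reference, a minimal sketch of the $\beta$-VAE objective (the first of the three models above); the reconstruction loss and $\beta$ value are illustrative assumptions, and the encoder/decoder are left out.

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=4.0):
    """Reconstruction term plus a beta-weighted KL term (beta=1 is a plain VAE)."""
    recon = F.mse_loss(x_recon, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + beta * kl
```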
Reinforcement Learning (RL) based solutions are being adopted in a variety of domains including robotics, health care, and industrial automation. Most attention is given to cases where these solutions work well, but RL policies share the same faults as most machine learning models: they fail when presented with out-of-distribution (OOD) inputs. OOD detection for RL is generally not well covered in the literature, and there is a lack of benchmarks for this task. In this work we propose a benchmark to evaluate OOD detection methods in a Reinforcement Learning setting, by modifying the physical parameters of non-visual standard environments or corrupting the state observations of visual environments. We discuss ways to generate custom RL environments that can produce OOD data, and we evaluate three uncertainty methods on the OOD detection task. Our results show that ensemble methods have the best OOD detection performance, with a lower standard deviation across multiple environments.
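A sketch of what the non-visual setting could look like in practice is shown below: perturbing physical parameters of a standard environment after training. The attribute names come from gym's classic-control CartPole; the specific values are assumptions, not the benchmark's exact recipe (and newer gym/gymnasium versions change the reset/step signatures).

```python
import gym

env = gym.make("CartPole-v1")
env.reset()
# Perturb the dynamics after training so the policy sees OOD transitions.
env.unwrapped.gravity = 19.6     # doubled gravity (default is 9.8)
env.unwrapped.masspole = 0.5     # much heavier pole than during training
transition = env.step(env.action_space.sample())   # step under OOD dynamics
```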
Uncertainty quantification in neural networks promises to increase the safety of AI systems, but it is not clear how performance varies with the size of the training set. In this paper we evaluate seven uncertainty methods on Fashion MNIST and CIFAR10, sub-sampling to produce varied training set sizes. We find that calibration error and out-of-distribution detection performance strongly depend on the training set size, with most methods being miscalibrated on the test set when trained with small training sets. Gradient-based methods seem to estimate epistemic uncertainty poorly and are the most affected by training set size. We expect our results can guide future research into uncertainty quantification and help practitioners select methods based on the data they have available.
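For reference, a minimal sketch of Expected Calibration Error (ECE), a common calibration-error metric of the kind referred to above; the binning scheme and names are assumptions.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """confidences: max softmax per sample; correct: boolean array of hits."""
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(correct[mask].mean() - confidences[mask].mean())
            ece += mask.mean() * gap   # bin weight times |accuracy - confidence|
    return ece
```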
Robots are becoming everyday devices, increasing their interaction with humans. To make human-machine interaction more natural, cognitive features like Visual Voice Activity Detection (VVAD), which detects whether a person is speaking given only the visual input of a camera, need to be implemented. Neural networks are state of the art for tasks in Image Processing, Time Series Prediction, Natural Language Processing, and other domains, but these networks require large quantities of labeled data. Currently there are few datasets for the task of VVAD. In this work we create a large-scale dataset called the VVAD-LRS3 dataset, derived through automatic annotation of the LRS3 dataset. The VVAD-LRS3 dataset contains over 44K samples, over three times the size of the next competitive dataset (WildVVAD). We evaluate different baselines on four kinds of features: facial and lip images, and facial and lip landmark features. With a Convolutional Neural Network Long Short-Term Memory (CNN LSTM) on facial images, an accuracy of 92% was reached on the test set. A study with humans showed that they reach an accuracy of 87.93% on the same test set.
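A minimal PyTorch sketch of a CNN LSTM of the kind used as a baseline above: a small CNN embeds each frame, an LSTM aggregates the sequence, and a linear head predicts speaking vs. not speaking. All layer sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(              # small per-frame feature extractor
            nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)       # logit: speaking vs not speaking

    def forward(self, x):                      # x: (B, T, 3, H, W) frame sequence
        b, t = x.shape[:2]
        feats = self.cnn(x.flatten(0, 1)).view(b, t, -1)
        _, (h, _) = self.lstm(feats)
        return self.head(h[-1])

logits = CNNLSTM()(torch.randn(2, 16, 3, 96, 96))   # 2 clips of 16 frames
```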
Uncertainty in machine learning is generally not taught in Machine Learning course curricula. In this paper we propose a short curriculum for a course about uncertainty in machine learning, and complement the course with a selection of use cases aimed at triggering discussion and letting students play with the concepts of uncertainty in a programming setting. Our use cases cover the concept of output uncertainty, Bayesian neural networks and weight distributions, sources of uncertainty, and out-of-distribution detection. We expect that this curriculum and set of use cases will motivate the community to adopt these important concepts into courses for safety in AI.