"Information": models, code, and papers

Neural Polarizer: A Lightweight and Effective Backdoor Defense via Purifying Poisoned Features

Jun 29, 2023
Mingli Zhu, Shaokui Wei, Hongyuan Zha, Baoyuan Wu

Recent studies have demonstrated the susceptibility of deep neural networks to backdoor attacks. Given a backdoored model, its prediction on a poisoned sample with a trigger will be dominated by the trigger information, even though trigger information and benign information coexist. Inspired by the optical polarizer, which passes light waves with particular polarizations while filtering out light waves with other polarizations, we propose a novel backdoor defense method that inserts a learnable neural polarizer into the backdoored model as an intermediate layer, in order to purify the poisoned sample by filtering out trigger information while maintaining benign information. The neural polarizer is instantiated as one lightweight linear transformation layer, which is learned by solving a well-designed bi-level optimization problem on a limited clean dataset. Compared to other fine-tuning-based defense methods, which often adjust all parameters of the backdoored model, the proposed method only needs to learn one additional layer, so it is more efficient and requires less clean data. Extensive experiments demonstrate the effectiveness and efficiency of our method in removing backdoors across various neural network architectures and datasets, especially in the case of very limited clean data.
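The idea of inserting one extra linear layer between two frozen halves of a model can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the names `LinearPolarizer` and `insert_polarizer`, the toy `front` and `back` functions, and the identity initialization are hypothetical and are not taken from the paper. The bi-level optimization that would actually learn the layer is not shown.

```python
from typing import Callable, List

Vector = List[float]


def identity_matrix(n: int) -> List[Vector]:
    """Return an n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


class LinearPolarizer:
    """Hypothetical stand-in for the inserted layer: z -> W z + b.

    Initialized to the identity map (an assumption), so inserting it
    does not change the model's output before any training happens.
    """

    def __init__(self, dim: int) -> None:
        self.weight = identity_matrix(dim)
        self.bias = [0.0] * dim

    def __call__(self, features: Vector) -> Vector:
        return [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(self.weight, self.bias)
        ]


def insert_polarizer(
    front: Callable[[Vector], Vector],
    back: Callable[[Vector], float],
    polarizer: LinearPolarizer,
) -> Callable[[Vector], float]:
    """Compose frozen front half -> polarizer -> frozen back half."""
    def model(x: Vector) -> float:
        return back(polarizer(front(x)))
    return model


# Toy frozen model halves (purely illustrative).
def front(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def back(z: Vector) -> float:
    return sum(z)


polarizer = LinearPolarizer(3)
defended = insert_polarizer(front, back, polarizer)

original_output = back(front([1.0, 2.0, 3.0]))
defended_output = defended([1.0, 2.0, 3.0])

# Only the polarizer's parameters change; here one feature channel is
# suppressed by hand to mimic "filtering" a direction in feature space.
polarizer.weight[2][2] = 0.0
filtered_output = defended([1.0, 2.0, 3.0])
```

In this sketch only `polarizer.weight` and `polarizer.bias` would be trainable, which mirrors the abstract's point that a single additional layer is learned while the backdoored model's own parameters stay fixed.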

Lower Bounds on the Bayesian Risk via Information Measures

Mar 24, 2023
Amedeo Roberto Esposito, Adrien Vandenbroucque, Michael Gastpar

This paper focuses on parameter estimation and introduces a new method for lower bounding the Bayesian risk. The method allows for the use of virtually any information measure, including Rényi's $\alpha$, $\varphi$-Divergences, and Sibson's $\alpha$-Mutual Information. The approach considers divergences as functionals of measures and exploits the duality between spaces of measures and spaces of functions. In particular, we show that one can lower bound the risk with any information measure by upper bounding its dual via Markov's inequality. We are thus able to provide estimator-independent impossibility results thanks to the Data-Processing Inequalities that divergences satisfy. The results are then applied to settings of interest involving both discrete and continuous parameters, including the "Hide-and-Seek" problem, and compared to the state-of-the-art techniques. An important observation is that the behaviour of the lower bound in the number of samples is influenced by the choice of the information measure. We leverage this by introducing a new divergence inspired by the "Hockey-Stick" Divergence, which is demonstrated empirically to provide the largest lower bound across all considered settings. If the observations are subject to privatisation, stronger impossibility results can be obtained via Strong Data-Processing Inequalities. The paper also discusses some generalisations and alternative directions.
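For readers unfamiliar with the Hockey-Stick Divergence mentioned above, the following minimal sketch computes its standard form for discrete distributions, E_gamma(P||Q) = sum over x of max(P(x) - gamma * Q(x), 0). This is the textbook divergence, not the new divergence the paper introduces; the function name and the example distributions are illustrative assumptions.

```python
from typing import Sequence


def hockey_stick(p: Sequence[float], q: Sequence[float], gamma: float = 1.0) -> float:
    """Standard hockey-stick divergence between discrete distributions.

    E_gamma(P||Q) = sum_x max(P(x) - gamma * Q(x), 0).
    With gamma = 1 this equals the total variation distance.
    """
    if len(p) != len(q):
        raise ValueError("p and q must be defined on the same alphabet")
    return sum(max(pi - gamma * qi, 0.0) for pi, qi in zip(p, q))


# Illustrative distributions on a three-letter alphabet.
p = [0.5, 0.5, 0.0]
q = [0.25, 0.25, 0.5]

total_variation = hockey_stick(p, q, gamma=1.0)
stricter = hockey_stick(p, q, gamma=2.0)
```

Raising `gamma` makes the divergence smaller, since mass in `p` counts only where it exceeds `gamma` times the mass in `q`.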

A Survey and Approach to Chart Classification

Jul 09, 2023
Anurag Dhote, Mohammed Javed, David S Doermann

Charts represent an essential source of visual information in documents and facilitate a deep understanding and interpretation of information typically conveyed numerically. The scientific literature contains many charts, each with its own stylistic differences. Recently, the document understanding community has begun to address the problem of automatic chart understanding, which begins with chart classification. In this paper, we present a survey of the current state-of-the-art techniques for chart classification and discuss the available datasets and their supported chart types. We broadly classify these contributions as traditional approaches based on ML, CNN, and Transformers. Furthermore, we carry out an extensive comparative performance analysis of CNN-based and transformer-based approaches on the recently published CHARTINFO UB-UNITECH PMC dataset for the CHART-Infographics competition at ICPR 2022. The dataset covers 15 different chart categories, with 22,923 training images and 13,260 test images. We have implemented a vision-based transformer model that produces state-of-the-art results in chart classification.

* Accepted in 15th IAPR Workshop on Graphics Recognition (GREC) 2023 in conjunction with 17th International Conference on Document Analysis and Recognition (ICDAR) 2023, August 21-26, 2023 San Jose, USA

Transformer-based end-to-end classification of variable-length volumetric data

Jul 13, 2023
Marzieh Oghbaie, Teresa Araujo, Taha Emre, Ursula Schmidt-Erfurth, Hrvoje Bogunovic

The automatic classification of 3D medical data is memory-intensive. Also, variations in the number of slices between samples are common. Naive solutions such as subsampling can solve these problems, but at the cost of potentially eliminating relevant diagnostic information. Transformers have shown promising performance for sequential data analysis. However, their application to long sequences is demanding in data, computation, and memory. In this paper, we propose an end-to-end Transformer-based framework that classifies volumetric data of variable length in an efficient fashion. In particular, by randomizing the input slice-wise resolution during training, we enhance the capacity of the learnable positional embedding assigned to each volume slice. Consequently, the accumulated positional information in each positional embedding can be generalized to the neighbouring slices, even for high-resolution volumes at test time. By doing so, the model will be more robust to variable volume length and amenable to different computational budgets. We evaluated the proposed approach in retinal OCT volume classification and achieved a 21.96% average improvement in balanced accuracy on a 9-class diagnostic task, compared to state-of-the-art video transformers. Our findings show that varying the slice-wise resolution of the input during training results in a more informative volume representation than training with a fixed number of slices per volume. Our code is available at: https://github.com/marziehoghbaie/VLFAT.
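One way to picture "randomizing the input slice-wise resolution during training" is to draw a different number of slices for each training step and pick them evenly across the volume. The sketch below does that. It is an assumption about how such sampling could look, not the authors' implementation (see their repository for that); the function name, the even spacing, and the slice-count range are all hypothetical.

```python
import random
from typing import List


def sample_slice_indices(
    volume_depth: int, min_slices: int, max_slices: int, rng: random.Random
) -> List[int]:
    """Pick a random number of evenly spaced slice indices from a volume.

    Assumes 1 <= min_slices <= min(max_slices, volume_depth).
    """
    upper = min(max_slices, volume_depth)
    if not 1 <= min_slices <= upper:
        raise ValueError("need 1 <= min_slices <= min(max_slices, volume_depth)")
    num_slices = rng.randint(min_slices, upper)  # inclusive on both ends
    if num_slices == 1:
        return [volume_depth // 2]  # single slice: take the middle one
    step = (volume_depth - 1) / (num_slices - 1)
    return [round(i * step) for i in range(num_slices)]


rng = random.Random(0)
# A hypothetical volume with 49 slices; each call may return a different count.
indices = sample_slice_indices(volume_depth=49, min_slices=5, max_slices=20, rng=rng)
```

Because each index is also a position in the original volume, it could be used to look up the positional embedding for that slice, which is how a model trained this way would see many different slice spacings.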

BovineTalk: Machine Learning for Vocalization Analysis of Dairy Cattle under Negative Affective States

Jul 26, 2023
Dinu Gavojdian, Teddy Lazebnik, Madalina Mincu, Ariel Oren, Ioana Nicolae, Anna Zamansky

There is a critical need to develop and validate non-invasive animal-based indicators of affective states in livestock species, in order to integrate them into on-farm assessment protocols, potentially via the use of precision livestock farming (PLF) tools. One such promising approach is the use of vocal indicators. The acoustic structure of vocalizations and their functions have been extensively studied in important livestock species, such as pigs, horses, poultry and goats, yet cattle remain understudied in this context to date. Cows have been shown to produce two types of vocalizations: low-frequency calls (LF), produced with the mouth closed or partially closed for close-distance contact, and high-frequency calls (HF), emitted with the mouth open for long-distance communication, with the latter considered to be largely associated with negative affective states. Moreover, cattle vocalizations have been shown to contain information on individuality across a wide range of contexts, both negative and positive. Nowadays, dairy cows face a series of negative challenges and stressors in a typical production cycle, making vocalizations during negative affective states of special interest for research. One contribution of this study is providing the largest pre-processed (cleaned of noise) dataset to date of lactating adult multiparous dairy cows during negative affective states induced by visual isolation challenges. Here we present two computational frameworks, one based on deep learning and one based on explainable machine learning, to classify high- and low-frequency cattle calls and to recognize individual cow voices. Our models in these two frameworks reached 87.2% and 89.4% accuracy for LF and HF classification, with 68.9% and 72.5% accuracy rates for the cow individual identification, respectively.

Beyond Single-Feature Importance with ICECREAM

Jul 19, 2023
Michael Oesterle, Patrick Blöbaum, Atalanti A. Mastakouri, Elke Kirschbaum

Which set of features was responsible for a certain output of a machine learning model? Which components caused the failure of a cloud computing application? These are just two examples of questions we are addressing in this work by Identifying Coalition-based Explanations for Common and Rare Events in Any Model (ICECREAM). Specifically, we propose an information-theoretic quantitative measure for the influence of a coalition of variables on the distribution of a target variable. This allows us to identify which set of factors is essential to obtain a certain outcome, as opposed to well-established explainability and causal contribution analysis methods which can assign contributions only to individual factors and rank them by their importance. In experiments with synthetic and real-world data, we show that ICECREAM outperforms state-of-the-art methods for explainability and root cause analysis, and achieves impressive accuracy in both tasks.

Blind Image Quality Assessment Using Multi-Stream Architecture with Spatial and Channel Attention

Jul 19, 2023
Hassan Khalid, Nisar Ahmed

BIQA (Blind Image Quality Assessment) is an important field of study that evaluates images automatically. Although significant progress has been made, blind image quality assessment remains a difficult task since images vary in content and distortions. Most algorithms generate quality scores without emphasizing the important region of interest. To solve this, a multi-stream spatial and channel attention-based algorithm is proposed. This algorithm generates more accurate predictions with a high correlation to human perceptual assessment by combining hybrid features from two different backbones, followed by spatial and channel attention to give high weights to the region of interest. Four legacy image quality assessment datasets are used to validate the effectiveness of our proposed approach. Authentic and synthetic distortion image databases are used to demonstrate the effectiveness of the proposed method, and we show that it has excellent generalization properties with a particular focus on the perceptual foreground information.

Enhancing Cross-lingual Transfer via Phonemic Transcription Integration

Jul 10, 2023
Hoang H. Nguyen, Chenwei Zhang, Tao Zhang, Eugene Rohrbaugh, Philip S. Yu

Previous cross-lingual transfer methods are restricted to orthographic representation learning via textual scripts. This limitation hampers cross-lingual transfer and is biased towards languages sharing similar well-known scripts. To alleviate the gap between languages from different writing scripts, we propose PhoneXL, a framework incorporating phonemic transcriptions as an additional linguistic modality beyond the traditional orthographic transcriptions for cross-lingual transfer. Particularly, we propose unsupervised alignment objectives to capture (1) local one-to-one alignment between the two different modalities, (2) alignment via multi-modality contexts to leverage information from additional modalities, and (3) alignment via multilingual contexts where additional bilingual dictionaries are incorporated. We also release the first phonemic-orthographic alignment dataset on two token-level tasks (Named Entity Recognition and Part-of-Speech Tagging) among the understudied but interconnected Chinese-Japanese-Korean-Vietnamese (CJKV) languages. Our pilot study reveals phonemic transcription provides essential information beyond the orthography to enhance cross-lingual transfer and bridge the gap among CJKV languages, leading to consistent improvements on cross-lingual token-level tasks over orthographic-based multilingual PLMs.

* 11 pages, 1 figure, 7 tables. To appear in Findings of ACL 2023

On Collaboration in Distributed Parameter Estimation with Resource Constraints

Jul 12, 2023
Yu-Zhen Janice Chen, Daniel S. Menasché, Don Towsley

We study sensor/agent data collection and collaboration policies for parameter estimation, accounting for resource constraints and correlation between observations collected by distinct sensors/agents. Specifically, we consider a group of sensors/agents, each of which samples from different variables of a multivariate Gaussian distribution and has different estimation objectives, and we formulate a sensor/agent's data collection and collaboration policy design problem as a Fisher information maximization (or Cramer-Rao bound minimization) problem. When knowledge of the correlation between variables is available, we analytically identify two particular scenarios: (1) where the knowledge of the correlation between samples cannot be leveraged for collaborative estimation purposes and (2) where the optimal data collection policy involves investing scarce resources to collaboratively sample and transfer information that is not of immediate interest and whose statistics are already known, with the sole goal of increasing the confidence in the estimate of the parameter of interest. When knowledge of certain correlations is unavailable but collaboration may still be worthwhile, we propose novel ways to apply multi-armed bandit algorithms to learn the optimal data collection and collaboration policy in our distributed parameter estimation problem, and we demonstrate through simulations that the proposed algorithms, DOUBLE-F, DOUBLE-Z, UCB-F, and UCB-Z, are effective.
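As background for the bandit part of the abstract, here is a minimal sketch of the textbook UCB1 algorithm. It is not DOUBLE-F, DOUBLE-Z, UCB-F, or UCB-Z, whose details the abstract does not give. Treating each arm as one candidate data collection or collaboration choice is an assumption made for illustration, as are the function name and the deterministic toy payoffs.

```python
import math
from typing import Callable, List


def ucb1(reward_fn: Callable[[int], float], num_arms: int, num_rounds: int) -> List[int]:
    """Run textbook UCB1 and return how often each arm was pulled.

    Each arm is pulled once first; afterwards the arm with the highest
    (empirical mean + sqrt(2 ln t / pulls)) index is chosen.
    """
    counts = [0] * num_arms
    totals = [0.0] * num_arms
    for t in range(1, num_rounds + 1):
        if t <= num_arms:
            arm = t - 1  # initialization: try every arm once
        else:
            arm = max(
                range(num_arms),
                key=lambda a: totals[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        counts[arm] += 1
        totals[arm] += reward_fn(arm)
    return counts


# Toy setup: arm 1 (say, "collaborate") always pays more than arm 0.
payoffs = [0.2, 0.8]
pull_counts = ucb1(lambda arm: payoffs[arm], num_arms=2, num_rounds=1000)
```

The exploration bonus shrinks as an arm is pulled more often, so the worse arm is still sampled occasionally but the better arm ends up with most of the pulls.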

Diffusion idea exploration for art generation

Jul 11, 2023
Nikhil Verma

Cross-modal learning tasks have picked up pace in recent times. Despite a plethora of applications in diverse areas, generating novel content from multiple modalities of data remains a challenging problem. To address it, various generative modelling techniques have been proposed for specific tasks. Novel and creative image generation is one important aspect for industrial applications, where it could serve as a tool for novel content generation. Previously proposed techniques used Generative Adversarial Networks (GANs), autoregressive models and Variational Autoencoders (VAEs) to accomplish similar tasks. These approaches are limited in their capability to produce images guided by either text instructions or rough sketch images, which decreases the overall performance of the image generator. We used state-of-the-art diffusion models to generate creative art by primarily leveraging text with additional support from rough sketches. Diffusion starts with a pattern of random dots and slowly converts that pattern into a design image using the guiding information fed into the model. Diffusion models have recently outperformed other generative models in image generation tasks using cross-modal data as guiding information. The initial experiments for this task of novel image generation demonstrated promising qualitative results.
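The "random dots" description corresponds to the end point of the forward (noising) process that diffusion models are trained to reverse. The sketch below shows only that forward direction in the common DDPM-style closed form, x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * noise. The linear schedule, its endpoint values, and all function names are illustrative assumptions and are not taken from the report. The learned, text- or sketch-guided denoiser that performs generation is not shown.

```python
import math
import random
from typing import List


def linear_beta_schedule(
    num_steps: int, beta_start: float = 1e-4, beta_end: float = 0.02
) -> List[float]:
    """Noise variances that grow linearly from beta_start to beta_end."""
    if num_steps == 1:
        return [beta_start]
    span = beta_end - beta_start
    return [beta_start + span * t / (num_steps - 1) for t in range(num_steps)]


def cumulative_alphas(betas: List[float]) -> List[float]:
    """abar_t = product of (1 - beta_s) for s up to t."""
    result, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        result.append(running)
    return result


def noise_sample(x0: List[float], alpha_bar_t: float, rng: random.Random) -> List[float]:
    """Draw x_t directly from x_0 using the closed-form forward process."""
    signal_scale = math.sqrt(alpha_bar_t)
    noise_scale = math.sqrt(1.0 - alpha_bar_t)
    return [signal_scale * v + noise_scale * rng.gauss(0.0, 1.0) for v in x0]


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)
x0 = [1.0, -1.0, 0.5]  # stand-in for pixel values
slightly_noisy = noise_sample(x0, alpha_bars[0], rng)
mostly_noise = noise_sample(x0, alpha_bars[-1], rng)
```

At the last step almost none of the original signal remains, which is the "pattern of random dots" that generation starts from before running the process in reverse.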

* Report Submitted for degree completion of Master of Science in Applied Computing at University of Toronto
