Danushka Bollegala

A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings

Sep 19, 2023

Danushka Bollegala, Shuichi Otake, Tomoya Machide, Ken-ichi Kawarabayashi

Abstract: We propose a Neighbourhood-Aware Differential Privacy (NADP) mechanism considering the neighbourhood of a word in a pretrained static word embedding space to determine the minimal amount of noise required to guarantee a specified privacy level. We first construct a nearest neighbour graph over the words using their embeddings, and factorise it into a set of connected components (i.e. neighbourhoods). We then separately apply different levels of Gaussian noise to the words in each neighbourhood, determined by the set of words in that neighbourhood. Experiments show that our proposed NADP mechanism consistently outperforms multiple previously proposed DP mechanisms such as Laplacian, Gaussian, and Mahalanobis in multiple downstream tasks, while guaranteeing higher levels of privacy.

* Accepted to IJCNLP-AACL 2023
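
The pipeline outlined in the abstract above (nearest neighbour graph, connected components, per-neighbourhood Gaussian noise) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the 1-nearest-neighbour graph, and in particular the `diameter_sigma` rule for choosing the noise level are assumptions made here, since the abstract does not state how the noise level is derived from a neighbourhood.

```python
import math
import random


def nearest_neighbour_components(embeddings):
    """Link every word to its nearest neighbour and return the connected components."""
    words = list(embeddings)
    parent = {w: w for w in words}

    def find(w):
        # Union-find root lookup with path halving.
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for w in words:
        others = [u for u in words if u != w]
        nearest = min(others, key=lambda u: math.dist(embeddings[w], embeddings[u]))
        parent[find(w)] = find(nearest)  # merge the two neighbourhoods

    components = {}
    for w in words:
        components.setdefault(find(w), []).append(w)
    return list(components.values())


def diameter_sigma(component, embeddings, scale=0.1):
    """Hypothetical rule (an assumption): noise proportional to the neighbourhood diameter."""
    diameter = max(
        math.dist(embeddings[a], embeddings[b]) for a in component for b in component
    )
    return scale * diameter


def add_neighbourhood_noise(embeddings, sigma_for=diameter_sigma, seed=0):
    """Add Gaussian noise whose scale is chosen separately for each neighbourhood."""
    rng = random.Random(seed)
    noisy = {}
    for component in nearest_neighbour_components(embeddings):
        sigma = sigma_for(component, embeddings)
        for w in component:
            noisy[w] = [x + rng.gauss(0.0, sigma) for x in embeddings[w]]
    return noisy
```

Calling `add_neighbourhood_noise({"cat": [0.0, 0.0], "dog": [0.1, 0.0], "car": [5.0, 5.0], "bus": [5.0, 5.2]})` would perturb the two tight neighbourhoods with different noise scales. A real mechanism would have to derive the scale from the target privacy level rather than from this toy rule.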

The Impact of Debiasing on the Performance of Language Models in Downstream Tasks is Underestimated

Sep 16, 2023

Masahiro Kaneko, Danushka Bollegala, Naoaki Okazaki

Abstract: Pre-trained language models trained on large-scale data have learned serious levels of social biases. Consequently, various methods have been proposed to debias pre-trained models. Debiasing methods need to remove only discriminatory bias information from the pre-trained models, while retaining information that is useful for the downstream tasks. In previous research, whether useful information is retained has been confirmed by the performance of debiased pre-trained models on downstream tasks. On the other hand, it is not clear whether these benchmarks contain data pertaining to social biases and are appropriate for investigating the impact of debiasing. For example, in gender-related social biases, data containing female words (e.g. "she, female, woman"), male words (e.g. "he, male, man"), and stereotypical words (e.g. "nurse, doctor, professor") are considered to be the most affected by debiasing. If there is not much data containing these words in a benchmark dataset for a target task, there is the possibility of erroneously evaluating the effects of debiasing. In this study, we compare the impact of debiasing on performance across multiple downstream tasks using a wide range of benchmark datasets that contain female, male, and stereotypical words. Experiments show that the effects of debiasing are consistently underestimated across all tasks. Moreover, the effects of debiasing could be more reliably evaluated by separately considering instances containing female, male, and stereotypical words rather than all of the instances in a benchmark dataset.

* IJCNLP-AACL 2023

In-Contextual Bias Suppression for Large Language Models

Sep 13, 2023

Daisuke Oba, Masahiro Kaneko, Danushka Bollegala

Abstract: Despite their impressive performance in a wide range of NLP tasks, Large Language Models (LLMs) have been reported to encode worrying levels of gender bias. Prior work has proposed debiasing methods that require human labelled examples, data augmentation and fine-tuning of the LLMs, which are computationally costly. Moreover, one might not even have access to the internal parameters for performing debiasing, as in the case of commercially available LLMs such as GPT-4. To address this challenge, we propose bias suppression, a novel alternative to debiasing that does not require access to model parameters. We show that text-based preambles, generated from manually designed templates covering counterfactual statements, can accurately suppress gender biases in LLMs. Moreover, we find that descriptive sentences for occupations can further suppress gender biases. Interestingly, we find that bias suppression has a minimal adverse effect on downstream task performance, while effectively mitigating the gender biases.

* 13 pages

Learning to Predict Concept Ordering for Common Sense Generation

Sep 12, 2023

Tianhui Zhang, Danushka Bollegala, Bei Peng

Abstract: Prior work has shown that the ordering in which concepts are shown to a commonsense generator plays an important role, affecting the quality of the generated sentence. However, it remains a challenge to determine the optimal ordering of a given set of concepts such that a natural sentence covering all the concepts could be generated from a pretrained generator. To understand the relationship between the ordering of the input concepts and the quality of the generated sentences, we conduct a systematic study considering multiple language models (LMs) and concept ordering strategies. We find that the BART-large model consistently outperforms all other LMs considered in this study when fine-tuned using the ordering of concepts as they appear in CommonGen training data, as measured using multiple evaluation metrics. Moreover, the larger GPT3-based large language model (LLM) variants do not necessarily outperform much smaller LMs on this task, even when fine-tuned on task-specific training data. Interestingly, human annotators significantly reorder input concept sets when manually writing sentences covering those concepts, and this ordering provides the best sentence generations independently of the LM used for the generation, outperforming a probabilistic concept ordering baseline.

* 10 pages

Learn from Incomplete Tactile Data: Tactile Representation Learning with Masked Autoencoders

Jul 14, 2023

Guanqun Cao, Jiaqi Jiang, Danushka Bollegala, Shan Luo

Abstract: Missing signals caused by occluded objects or an unstable sensor are a common challenge during data collection. Such missing signals will adversely affect the results obtained from the data, and this issue is observed more frequently in robotic tactile perception. In tactile perception, due to the limited working space and the dynamic environment, the contact between the tactile sensor and the object is frequently insufficient and unstable, which causes the partial loss of signals, thus leading to incomplete tactile data. The tactile data will therefore contain fewer tactile cues with low information density. In this paper, we propose a tactile representation learning method, named TacMAE, based on a Masked Autoencoder to address the problem of incomplete tactile data in tactile perception. In our framework, a portion of the tactile image is masked out to simulate the missing contact region. By reconstructing the missing signals in the tactile image, the trained model can achieve a high-level understanding of surface geometry and tactile properties from limited tactile cues. The experimental results of tactile texture recognition show that our proposed TacMAE can achieve a high recognition accuracy of 71.4% in the zero-shot transfer and 85.8% after fine-tuning, which are 15.2% and 8.2% higher than the results without using masked modeling. The extensive experiments on YCB objects demonstrate the knowledge transferability of our proposed method and the potential to improve efficiency in tactile exploration.

* This paper is accepted at IROS 2023
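
The masking step mentioned in the abstract above (hiding part of a tactile image to simulate a missing contact region) might look roughly like the sketch below. It is illustrative only: the patch size, the mask ratio, the zero fill value, and the function name `mask_patches` are assumptions made here, and the reconstruction network itself is not shown.

```python
import random


def mask_patches(image, patch_size, mask_ratio, seed=0):
    """Zero out a random subset of non-overlapping square patches of a 2D image.

    `image` is a list of rows (lists of floats). Returns the masked copy and the
    top-left corners of the masked patches, which a reconstruction loss would target.
    """
    height, width = len(image), len(image[0])
    corners = [
        (r, c)
        for r in range(0, height, patch_size)
        for c in range(0, width, patch_size)
    ]
    rng = random.Random(seed)
    n_masked = round(mask_ratio * len(corners))
    masked = rng.sample(corners, n_masked)

    out = [row[:] for row in image]  # leave the input untouched
    for r, c in masked:
        for i in range(r, min(r + patch_size, height)):
            for j in range(c, min(c + patch_size, width)):
                out[i][j] = 0.0
    return out, masked
```

An autoencoder trained in this style would receive `out` as input and be asked to reconstruct the original pixels at the masked locations.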

Multimodal Zero-Shot Learning for Tactile Texture Recognition

Jun 22, 2023

Guanqun Cao, Jiaqi Jiang, Danushka Bollegala, Min Li, Shan Luo

Abstract: Tactile sensing plays an irreplaceable role in robotic material recognition. It enables robots to distinguish material properties such as their local geometry and textures, especially for materials like textiles. However, most tactile recognition methods can only classify known materials that have been touched and for which tactile training data exist, and cannot classify unknown materials for which no tactile training data exist. To solve this problem, we propose a tactile zero-shot learning framework to recognise unknown materials when they are touched for the first time without requiring training tactile samples. The visual modality, providing tactile cues from sight, and semantic attributes, giving high-level characteristics, are combined to bridge the gap between touched classes and untouched classes. A generative model is learnt to synthesise tactile features according to corresponding visual images and semantic embeddings, and then a classifier can be trained using the synthesised tactile features of untouched materials for zero-shot recognition. Extensive experiments demonstrate that our proposed multimodal generative model can achieve a high recognition accuracy of 83.06% in classifying materials that were not touched before. The robotic experiment demo and the dataset are available at https://sites.google.com/view/multimodalzsl.

* Under review at Robotics and Autonomous Systems

Together We Make Sense -- Learning Meta-Sense Embeddings from Pretrained Static Sense Embeddings

May 30, 2023

Haochen Luo, Yi Zhou, Danushka Bollegala

Abstract: Sense embedding learning methods learn multiple vectors for a given ambiguous word, corresponding to its different word senses. For this purpose, different methods have been proposed in prior work on sense embedding learning that use different sense inventories, sense-tagged corpora and learning methods. However, not all existing sense embeddings cover all senses of ambiguous words equally well due to the discrepancies in their training resources. To address this problem, we propose the first-ever meta-sense embedding method -- Neighbour Preserving Meta-Sense Embeddings, which learns meta-sense embeddings by combining multiple independently trained source sense embeddings such that the sense neighbourhoods computed from the source embeddings are preserved in the meta-embedding space. Our proposed method can combine source sense embeddings that cover different sets of word senses. Experimental results on Word Sense Disambiguation (WSD) and Word-in-Context (WiC) tasks show that the proposed meta-sense embedding method consistently outperforms several competitive baselines.

* Accepted to Findings of ACL 2023

Metrics for quantifying isotropy in high dimensional unsupervised clustering tasks in a materials context

May 25, 2023

Samantha Durdy, Michael W. Gaultois, Vladimir Gusev, Danushka Bollegala, Matthew J. Rosseinsky

Abstract: Clustering is a common task in machine learning, but clusters of unlabelled data can be hard to quantify. The application of clustering algorithms in chemistry is often dependent on material representation. Ascertaining the effects of different representations, clustering algorithms, or data transformations on the resulting clusters is difficult due to the dimensionality of these data. We present a thorough analysis of measures for isotropy of a cluster, including a novel implementation based on an existing derivation. Using fractional anisotropy, a common method used in medical imaging for comparison, we then expand these measures to examine the average isotropy of a set of clusters. A use case for such measures is demonstrated by quantifying the effects of kernel approximation functions on different representations of the Inorganic Crystal Structure Database. Broader applicability of these methods is demonstrated in analysing learnt embedding of the MNIST dataset. Random clusters are explored to examine the differences between isotropy measures presented, and to see how each method scales with the dimensionality. Python implementations of these measures are provided for use by the community.

* 31 pages, 6 figures
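
For readers unfamiliar with fractional anisotropy, which the abstract above uses as a point of comparison, the sketch below computes it from the eigenvalues of a cluster's covariance matrix. For three eigenvalues it is the standard medical-imaging formula; the extension to n eigenvalues with a sqrt(n / (n - 1)) normalisation is an assumption made here for illustration and may differ from the measures implemented in the paper.

```python
import math


def fractional_anisotropy(eigenvalues):
    """Fractional anisotropy of a set of non-negative covariance eigenvalues.

    Returns 0 for a perfectly isotropic cluster (all eigenvalues equal) and 1
    when all variance lies along a single axis.
    """
    n = len(eigenvalues)
    if n < 2:
        raise ValueError("need at least two eigenvalues")
    mean = sum(eigenvalues) / n
    spread = sum((lam - mean) ** 2 for lam in eigenvalues)
    energy = sum(lam ** 2 for lam in eigenvalues)
    if energy == 0:
        return 0.0  # degenerate cluster with no variance at all
    return math.sqrt(n / (n - 1) * spread / energy)
```

Averaging this value over all clusters in a clustering would give one possible summary of how isotropic the clusters are overall.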

Solving Cosine Similarity Underestimation between High Frequency Words by L2 Norm Discounting

May 17, 2023

Saeth Wannasuphoprasit, Yi Zhou, Danushka Bollegala

Abstract: Cosine similarity between two words, computed using their contextualised token embeddings obtained from masked language models (MLMs) such as BERT, has been shown to underestimate the actual similarity between those words (Zhou et al., 2022). This similarity underestimation problem is particularly severe for highly frequent words. Although this problem has been noted in prior work, no solution has been proposed thus far. We observe that the L2 norm of contextualised embeddings of a word correlates with its log-frequency in the pretraining corpus. Consequently, the larger L2 norms associated with the highly frequent words reduce the cosine similarity values measured between them, thus underestimating the similarity scores. To solve this issue, we propose a method to discount the L2 norm of a contextualised word embedding by the frequency of that word in a corpus when measuring the cosine similarities between words. We show that the so-called stop words behave differently from the rest of the words and require special consideration during their discounting process. Experimental results on a contextualised word similarity dataset show that our proposed discounting method accurately solves the similarity underestimation problem.

* 7 pages, 5 figures. To be published in the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, 9-14 July 2023, Toronto, Canada
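
The general idea of discounting an embedding's L2 norm by word frequency when computing cosine similarity can be illustrated as below. The abstract above does not give the discounting function, so the form `1 + alpha * log(freq)`, the parameter `alpha`, and the function name are purely hypothetical placeholders and should not be read as the paper's method.

```python
import math


def discounted_cosine(x, y, freq_x, freq_y, alpha=0.1):
    """Cosine-style similarity with frequency-discounted L2 norms.

    Assumes freq_x, freq_y >= 1 and non-zero vectors. The discount
    1 + alpha * log(freq) is a hypothetical choice made for illustration.
    Note that the result is no longer guaranteed to lie in [-1, 1].
    """
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    discount_x = 1.0 + alpha * math.log(freq_x)
    discount_y = 1.0 + alpha * math.log(freq_y)
    return dot / ((norm_x / discount_x) * (norm_y / discount_y))
```

With both frequencies set to 1 the discount vanishes and the function reduces to ordinary cosine similarity; for frequent words with a positive dot product, the score is raised.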

Unsupervised Semantic Variation Prediction using the Distribution of Sibling Embeddings

May 15, 2023

Taichi Aida, Danushka Bollegala

Abstract: Languages are dynamic entities, where the meanings associated with words constantly change with time. Detecting the semantic variation of words is an important task for various NLP applications that must make time-sensitive predictions. Existing work on semantic variation prediction has predominantly focused on comparing some form of an averaged contextualised representation of a target word computed from a given corpus. However, some of the previously associated meanings of a target word can become obsolete over time (e.g. meaning of gay as happy), while novel usages of existing words are observed (e.g. meaning of cell as a mobile phone). We argue that mean representations alone cannot accurately capture such semantic variations and propose a method that uses the entire cohort of the contextualised embeddings of the target word, which we refer to as the sibling distribution. Experimental results on the SemEval-2020 Task 1 benchmark dataset for semantic variation prediction show that our method outperforms prior work that considers only the mean embeddings, and is comparable to the current state-of-the-art. Moreover, a qualitative analysis shows that our method detects important semantic changes in words that are not captured by the existing methods. Source code is available at https://github.com/a1da4/svp-gauss.

* Findings of ACL 2023
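
To see why a mean embedding alone can miss a change that the full set of sibling embeddings reveals, consider the toy sketch below. It fits a diagonal Gaussian to a word's embeddings in each corpus and compares the two with a KL divergence. Both the diagonal Gaussian and the KL divergence are assumptions chosen here for illustration; the abstract above does not specify the distribution or the comparison measure.

```python
import math


def fit_diagonal_gaussian(vectors, eps=1e-6):
    """Per-dimension mean and variance of a set of embeddings (eps avoids zero variance)."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n + eps for d in range(dim)]
    return mean, var


def kl_diagonal(p, q):
    """KL divergence KL(p || q) between two diagonal Gaussians given as (mean, var)."""
    (mu_p, var_p), (mu_q, var_q) = p, q
    return 0.5 * sum(
        math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
        for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q)
    )


# Toy data: the word's usages are spread out in the old corpus and tight in the new one.
old_usages = [[1.0, 0.0], [-1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
new_usages = [[0.1, 0.0], [-0.1, 0.0], [0.1, 0.0], [-0.1, 0.0]]

old_dist = fit_diagonal_gaussian(old_usages)
new_dist = fit_diagonal_gaussian(new_usages)
mean_shift = math.dist(old_dist[0], new_dist[0])  # the means coincide
divergence = kl_diagonal(old_dist, new_dist)      # the distributions do not
```

In this toy case the two mean vectors are identical, so a mean-based comparison reports no change, while the divergence between the fitted distributions is clearly positive.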
