
Roger Wattenhofer

Towards Robust Graph Contrastive Learning

Feb 25, 2021
Nikola Jovanović, Zhao Meng, Lukas Faber, Roger Wattenhofer

We study the problem of adversarially robust self-supervised learning on graphs. In the contrastive learning framework, we introduce a new method that increases the adversarial robustness of the learned representations through i) adversarial transformations and ii) transformations that not only remove but also insert edges. We evaluate the learned representations in a preliminary set of experiments, obtaining promising results. We believe this work takes an important step towards incorporating robustness as a viable auxiliary task in graph contrastive learning.
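
To make the setup concrete, here is a minimal sketch in plain PyTorch of graph contrastive learning with augmentations that both remove and insert edges. It illustrates the general framework only: the adversarial transformations from the paper are replaced by random perturbations, and the GCN encoder, NT-Xent loss, and perturbation rates are assumptions.

```python
import torch
import torch.nn.functional as F

def augment(adj: torch.Tensor, drop_p: float = 0.1, add_p: float = 0.05) -> torch.Tensor:
    """Randomly remove existing edges and insert new ones (undirected graph)."""
    n = adj.size(0)
    keep = (torch.rand(n, n) > drop_p).float()   # mask deciding which edges survive
    new = (torch.rand(n, n) < add_p).float()     # candidate edges to insert
    aug = torch.clamp(adj * keep + new, 0, 1)
    aug = torch.triu(aug, diagonal=1)            # drop self-loops, keep upper triangle
    return aug + aug.t()                         # symmetrize

class GCNEncoder(torch.nn.Module):
    def __init__(self, in_dim: int, hid_dim: int):
        super().__init__()
        self.w1 = torch.nn.Linear(in_dim, hid_dim)
        self.w2 = torch.nn.Linear(hid_dim, hid_dim)

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        a = adj + torch.eye(adj.size(0))                 # add self-loops
        d = a.sum(1).clamp(min=1).pow(-0.5)
        a_norm = d[:, None] * a * d[None, :]             # symmetric normalization
        return self.w2(a_norm @ F.relu(self.w1(a_norm @ x)))

def nt_xent(z1: torch.Tensor, z2: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Node-level contrastive loss: node i in view 1 should match node i in view 2."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau
    return F.cross_entropy(logits, torch.arange(z1.size(0)))

# Toy usage: one random graph, two augmented views, one contrastive loss.
x = torch.randn(8, 5)
adj = (torch.rand(8, 8) < 0.3).float().triu(1)
adj = adj + adj.t()
encoder = GCNEncoder(5, 16)
loss = nt_xent(encoder(x, augment(adj)), encoder(x, augment(adj)))
loss.backward()
```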

Of Non-Linearity and Commutativity in BERT

Jan 14, 2021
Sumu Zhao, Damian Pascual, Gino Brunner, Roger Wattenhofer

In this work, we provide new insights into the transformer architecture and, in particular, its best-known variant, BERT. First, we propose a method to measure the degree of non-linearity of different elements of transformers. Next, we focus our investigation on the feed-forward networks (FFNs) inside transformers, which contain two thirds of the model parameters and have so far not received much attention. We find that FFNs are an inefficient yet important architectural element and that they cannot simply be replaced by attention blocks without a degradation in performance. Moreover, we study the interactions between layers in BERT and show that, while the layers exhibit some hierarchical structure, they extract features in a fuzzy manner. Our results suggest that BERT has an inductive bias towards layer commutativity, which we find is mainly due to the skip connections. This provides a justification for the strong performance of recurrent and weight-shared transformer models.
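
As a concrete illustration of the commutativity question, the probe below swaps two adjacent BERT layers and measures how much the final token representations change; a high cosine similarity after the swap would suggest the layers roughly commute. This is a simplified sketch built on the Hugging Face transformers library, not the paper's exact measurement protocol.

```python
import copy
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased").eval()

inputs = tokenizer("The feed-forward networks hold most of the parameters.",
                   return_tensors="pt")

with torch.no_grad():
    original = model(**inputs).last_hidden_state

# Swap two adjacent encoder layers and recompute the representations.
swapped = copy.deepcopy(model)
layers = swapped.encoder.layer
layers[5], layers[6] = layers[6], layers[5]

with torch.no_grad():
    permuted = swapped(**inputs).last_hidden_state

# Average per-token cosine similarity between the original and permuted model.
similarity = F.cosine_similarity(original, permuted, dim=-1).mean()
print(f"Mean cosine similarity after swapping layers 5 and 6: {similarity:.4f}")
```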

KM-BART: Knowledge Enhanced Multimodal BART for Visual Commonsense Generation

Jan 02, 2021
Yiran Xing, Zai Shi, Zhao Meng, Yunpu Ma, Roger Wattenhofer

We present Knowledge Enhanced Multimodal BART (KM-BART), a Transformer-based sequence-to-sequence model capable of reasoning about commonsense knowledge from multimodal inputs of images and text. We extend the popular BART architecture to a multimodal model. We design a new pretraining task to improve the model's performance on the Visual Commonsense Generation task. Our pretraining task improves Visual Commonsense Generation performance by leveraging knowledge from a large language model pretrained on an external knowledge graph. To the best of our knowledge, we are the first to propose a dedicated task for improving model performance on Visual Commonsense Generation. Experimental results show that, with this pretraining, our model reaches state-of-the-art performance on the Visual Commonsense Generation task.

* Work in progress. The first three authors contributed equally to this work.
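
One plausible way to realize the multimodal input described above is to project image-region features to the model's hidden size and prepend them to the token embeddings before BART's encoder. The sketch below, using the Hugging Face transformers library, is a rough illustration of that idea and not the released KM-BART implementation; the projection layer, the region count, and the 2048-dimensional features (as produced by a typical object detector) are assumptions.

```python
import torch
from transformers import BartModel, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartModel.from_pretrained("facebook/bart-base").eval()

text = "A person is holding an umbrella because"
enc = tokenizer(text, return_tensors="pt")

# Hypothetical visual input: 36 detected regions with 2048-dim features each.
visual_feats = torch.randn(1, 36, 2048)
visual_proj = torch.nn.Linear(2048, model.config.d_model)

token_embeds = model.get_input_embeddings()(enc["input_ids"])     # (1, T, d_model)
visual_embeds = visual_proj(visual_feats)                         # (1, 36, d_model)
inputs_embeds = torch.cat([visual_embeds, token_embeds], dim=1)   # prepend regions
attention_mask = torch.cat(
    [torch.ones(1, visual_embeds.size(1), dtype=torch.long), enc["attention_mask"]],
    dim=1)

decoder_input_ids = tokenizer(" The person wants to stay dry.",
                              return_tensors="pt")["input_ids"]

out = model(inputs_embeds=inputs_embeds,
            attention_mask=attention_mask,
            decoder_input_ids=decoder_input_ids)
print(out.last_hidden_state.shape)   # decoder hidden states over the target tokens
```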

Directed Beam Search: Plug-and-Play Lexically Constrained Language Generation

Dec 31, 2020
Damian Pascual, Beni Egressy, Florian Bolli, Roger Wattenhofer

Large pre-trained language models are capable of generating realistic text. However, controlling these models so that the generated text satisfies lexical constraints, i.e., contains specific words, is a challenging problem. Given that state-of-the-art language models are too large to be trained from scratch in a manageable time, it is desirable to control these models without re-training them. Methods capable of doing this are called plug-and-play. Recent plug-and-play methods have been successful in constraining small bidirectional language models, as well as forward models in tasks with a restricted search space, e.g., machine translation. However, controlling large transformer-based models to meet lexical constraints without re-training them remains a challenge. In this work, we propose Directed Beam Search (DBS), a plug-and-play method for lexically constrained language generation. Our method can be applied to any language model, is easy to implement, and can be used for general language generation. In our experiments we use DBS to control GPT-2. We demonstrate its performance on keyword-to-phrase generation and obtain results comparable to those of a state-of-the-art non-plug-and-play model for lexically constrained story generation.

* Preprint. Work in progress 
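
The sketch below conveys the plug-and-play flavour of keyword-guided decoding with an off-the-shelf GPT-2 from the transformers library: candidate continuations are ranked by their language-model log-probability plus a bonus when the newly added token is close to the guide word in embedding space. It is a much-simplified stand-in, not the DBS algorithm itself; the beam width, bonus weight, and similarity measure are assumptions.

```python
import torch
import torch.nn.functional as F
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
emb = model.get_input_embeddings().weight                 # (vocab, d)

def keyword_bonus(token_id: int, keyword_id: int) -> float:
    """Cosine similarity between a candidate token and the guide word."""
    return F.cosine_similarity(emb[token_id], emb[keyword_id], dim=0).item()

def directed_beam_search(prompt: str, keyword: str, steps: int = 10,
                         num_beams: int = 4, top_k: int = 20, lam: float = 2.0) -> str:
    keyword_id = tokenizer.encode(" " + keyword)[0]
    beams = [(tokenizer.encode(prompt), 0.0)]
    for _ in range(steps):
        candidates = []
        for ids, score in beams:
            with torch.no_grad():
                logits = model(torch.tensor([ids])).logits[0, -1]
            log_probs = F.log_softmax(logits, dim=-1)
            for tok in torch.topk(log_probs, top_k).indices.tolist():
                new_score = (score + log_probs[tok].item()
                             + lam * keyword_bonus(tok, keyword_id))
                candidates.append((ids + [tok], new_score))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:num_beams]
    return tokenizer.decode(beams[0][0])

print(directed_beam_search("The weather today is", "umbrella"))
```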

Contrastive Graph Neural Network Explanation

Oct 26, 2020
Lukas Faber, Amin K. Moghaddam, Roger Wattenhofer

Graph Neural Networks achieve remarkable results on problems with structured data but are black-box predictors. Transferring existing explanation techniques, such as occlusion, fails because even removing a single node or edge can lead to drastic changes in the graph. The resulting graphs can differ from all training examples, causing model confusion and wrong explanations. Thus, we argue that explanations must be based on graphs compliant with the distribution underlying the training data. We coin this property Distribution Compliant Explanation (DCE) and present a novel Contrastive GNN Explanation (CoGE) technique following this paradigm. An experimental study supports the efficacy of CoGE.

* ICML 2020 Workshop on Graph Representation Learning and Beyond (GRL+) 
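
To give the contrastive idea a concrete shape, the toy sketch below scores a node as important if occluding its features pushes the graph embedding away from same-class reference graphs and towards other-class ones. The encoder, the distance measure, and the occlusion scheme are placeholder assumptions; this is not the CoGE implementation.

```python
import torch

def embed(x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
    """Toy graph encoder: one propagation step followed by mean pooling."""
    a = adj + torch.eye(adj.size(0))
    return (a @ x).mean(dim=0)

def node_importance(x, adj, same_class_emb, other_class_emb):
    base = embed(x, adj)
    scores = []
    for i in range(x.size(0)):
        x_masked = x.clone()
        x_masked[i] = 0.0                                  # occlude node i's features
        e = embed(x_masked, adj)
        # Contrastive score: how much the occluded graph drifts from its own class
        # relative to the other class, compared to the unmodified graph.
        drift = (torch.dist(e, same_class_emb) - torch.dist(e, other_class_emb)) \
              - (torch.dist(base, same_class_emb) - torch.dist(base, other_class_emb))
        scores.append(drift.item())
    return scores

# Toy usage with random tensors standing in for reference graph embeddings.
x = torch.randn(6, 4)
adj = (torch.rand(6, 6) < 0.4).float().triu(1)
adj = adj + adj.t()
print(node_importance(x, adj, torch.randn(4), torch.randn(4)))
```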

Brain2Word: Decoding Brain Activity for Language Generation

Oct 13, 2020
Nicolas Affolter, Beni Egressy, Damian Pascual, Roger Wattenhofer

Brain decoding, understood as the process of mapping brain activities to the stimuli that generated them, has been an active research area in recent years. In the case of language stimuli, recent studies have shown that it is possible to decode fMRI scans into an embedding of the word a subject is reading. However, such word embeddings are designed for natural language processing tasks rather than for brain decoding. Therefore, they limit our ability to recover the precise stimulus. In this work, we propose to directly classify an fMRI scan, mapping it to the corresponding word within a fixed vocabulary. Unlike existing work, we evaluate on scans from previously unseen subjects. We argue that this is a more realistic setup, and we present a model that can decode fMRI data from unseen subjects. Our model achieves 5.22% Top-1 and 13.59% Top-5 accuracy on this challenging task, significantly outperforming all the considered competitive baselines. Furthermore, we use the decoded words to guide language generation with the GPT-2 model. This way, we advance the quest for a system that translates brain activities into coherent text.
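
A minimal sketch of the classification setup described above: map an fMRI vector to one word in a fixed vocabulary and report Top-1 / Top-5 accuracy. The layer sizes, vocabulary size, and voxel count below are placeholder assumptions, not the actual model or data.

```python
import torch
import torch.nn as nn

VOCAB_SIZE, FMRI_DIM = 180, 10000      # placeholder word count and voxel count

classifier = nn.Sequential(
    nn.Linear(FMRI_DIM, 512), nn.ReLU(), nn.Dropout(0.5),
    nn.Linear(512, VOCAB_SIZE),
)

def top_k_accuracy(logits: torch.Tensor, targets: torch.Tensor, k: int) -> float:
    topk = logits.topk(k, dim=1).indices                     # (batch, k) word ids
    return (topk == targets[:, None]).any(dim=1).float().mean().item()

# Toy evaluation batch standing in for scans from an unseen subject.
scans = torch.randn(32, FMRI_DIM)
labels = torch.randint(0, VOCAB_SIZE, (32,))
with torch.no_grad():
    logits = classifier(scans)
print("Top-1:", top_k_accuracy(logits, labels, 1),
      "Top-5:", top_k_accuracy(logits, labels, 5))
```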

A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples

Oct 03, 2020
Zhao Meng, Roger Wattenhofer

Generating adversarial examples for natural language is hard, as natural language consists of discrete symbols and examples are often of variable length. In this paper, we propose a geometry-inspired attack for generating natural language adversarial examples. Our attack generates adversarial examples by iteratively approximating the decision boundary of Deep Neural Networks (DNNs). Experiments on two datasets with two different models show that our attack fools natural language models with high success rates while only replacing a few words. Human evaluation shows that adversarial examples generated by our attack are hard for humans to recognize. Further experiments show that adversarial training can improve model robustness against our attack.

* COLING 2020 Long Paper 
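
For intuition only, the toy sketch below performs a single gradient-guided word substitution on a stand-in classifier: it estimates the direction towards the decision boundary and replaces the most sensitive word accordingly. This loosely illustrates boundary-driven substitution but is not the attack proposed in the paper; the model, vocabulary, and candidate selection are all assumptions.

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, num_classes = 100, 16, 2
embedding = nn.Embedding(vocab_size, emb_dim)
classifier = nn.Linear(emb_dim, num_classes)

def predict(embs: torch.Tensor) -> torch.Tensor:
    return classifier(embs.mean(dim=0))                    # mean-pool words, then classify

def one_substitution_step(token_ids: torch.Tensor) -> torch.Tensor:
    embs = embedding(token_ids).detach().requires_grad_(True)
    logits = predict(embs)
    orig = logits.argmax().item()
    other = 1 - orig                                       # binary toy task
    # Gradient of the margin points towards the boundary between the two classes.
    (logits[other] - logits[orig]).backward()
    grads = embs.grad                                      # (seq_len, emb_dim)
    pos = grads.norm(dim=1).argmax().item()                # most sensitive position
    # Candidate word: vocabulary entry best aligned with the boundary direction.
    scores = embedding.weight.detach() @ grads[pos]
    new_ids = token_ids.clone()
    new_ids[pos] = scores.argmax().item()
    return new_ids

tokens = torch.randint(0, vocab_size, (8,))
print(tokens.tolist(), "->", one_substitution_step(tokens).tolist())
```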

Medley2K: A Dataset of Medley Transitions

Aug 25, 2020
Lukas Faber, Sandro Luck, Damian Pascual, Andreas Roth, Gino Brunner, Roger Wattenhofer

The automatic generation of medleys, i.e., musical pieces formed by different songs concatenated via smooth transitions, is not well studied in the current literature. To facilitate research on this topic, we make available a dataset called Medley2K that consists of 2,000 medleys and 7,712 labeled transitions. Our dataset features a rich variety of song transitions across different music genres. We provide a detailed description of this dataset and validate it by training a state-of-the-art generative model on the task of generating transitions between songs.

* MML 2020 - 13th Int. Workshop on Machine Learning and Music at ECML-PKDD 2020 