Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Michalis Vazirgiannis

Ecole Polytechnique, AUEB

Leveraging Discourse Structure for Extractive Meeting Summarization

May 21, 2024

Virgile Rennard, Guokan Shang, Michalis Vazirgiannis, Julie Hunter

Abstract:We introduce an extractive summarization system for meetings that leverages discourse structure to better identify salient information from complex multi-party discussions. Using discourse graphs to represent semantic relations between the contents of utterances in a meeting, we train a GNN-based node classification model to select the most important utterances, which are then combined to create an extractive summary. Experimental results on AMI and ICSI demonstrate that our approach surpasses existing text-based and graph-based extractive summarization systems, as measured by both classification and summarization metrics. Additionally, we conduct ablation studies on discourse structure and relation type to provide insights for future NLP applications leveraging discourse analysis theory.

Via

Access Paper or Ask Questions

Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks

Apr 27, 2024

Yassine Abbahaddou, Sofiane Ennadir, Johannes F. Lutzeyer, Michalis Vazirgiannis, Henrik Boström

Figure 1 for Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks

Figure 2 for Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks

Figure 3 for Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks

Figure 4 for Bounding the Expected Robustness of Graph Neural Networks Subject to Node Feature Attacks

Abstract:Graph Neural Networks (GNNs) have demonstrated state-of-the-art performance in various graph representation learning tasks. Recently, studies revealed their vulnerability to adversarial attacks. In this work, we theoretically define the concept of expected robustness in the context of attributed graphs and relate it to the classical definition of adversarial robustness in the graph representation learning literature. Our definition allows us to derive an upper bound of the expected robustness of Graph Convolutional Networks (GCNs) and Graph Isomorphism Networks subject to node feature attacks. Building on these findings, we connect the expected robustness of GNNs to the orthonormality of their weight matrices and consequently propose an attack-independent, more robust variant of the GCN, called the Graph Convolutional Orthonormal Robust Networks (GCORNs). We further introduce a probabilistic method to estimate the expected robustness, which allows us to evaluate the effectiveness of GCORN on several real-world datasets. Experimental experiments showed that GCORN outperforms available defense methods. Our code is publicly available at: \href{https://github.com/Sennadir/GCORN}{https://github.com/Sennadir/GCORN}.

* Accepted at ICLR 2024

Via

Access Paper or Ask Questions

Neural Graph Generator: Feature-Conditioned Graph Generation using Latent Diffusion Models

Mar 03, 2024

Iakovos Evdaimon, Giannis Nikolentzos, Michail Chatzianastasis, Hadi Abdine, Michalis Vazirgiannis

Abstract:Graph generation has emerged as a crucial task in machine learning, with significant challenges in generating graphs that accurately reflect specific properties. Existing methods often fall short in efficiently addressing this need as they struggle with the high-dimensional complexity and varied nature of graph properties. In this paper, we introduce the Neural Graph Generator (NGG), a novel approach which utilizes conditioned latent diffusion models for graph generation. NGG demonstrates a remarkable capacity to model complex graph patterns, offering control over the graph generation process. NGG employs a variational graph autoencoder for graph compression and a diffusion process in the latent vector space, guided by vectors summarizing graph statistics. We demonstrate NGG's versatility across various graph generation tasks, showing its capability to capture desired graph properties and generalize to unseen graphs. This work signifies a significant shift in graph generation methodologies, offering a more practical and efficient solution for generating diverse types of graphs with specific characteristics.

Via

Access Paper or Ask Questions

A Simple and Yet Fairly Effective Defense for Graph Neural Networks

Feb 21, 2024

Sofiane Ennadir, Yassine Abbahaddou, Johannes F. Lutzeyer, Michalis Vazirgiannis, Henrik Boström

Abstract:Graph Neural Networks (GNNs) have emerged as the dominant approach for machine learning on graph-structured data. However, concerns have arisen regarding the vulnerability of GNNs to small adversarial perturbations. Existing defense methods against such perturbations suffer from high time complexity and can negatively impact the model's performance on clean graphs. To address these challenges, this paper introduces NoisyGNNs, a novel defense method that incorporates noise into the underlying model's architecture. We establish a theoretical connection between noise injection and the enhancement of GNN robustness, highlighting the effectiveness of our approach. We further conduct extensive empirical evaluations on the node classification task to validate our theoretical findings, focusing on two popular GNNs: the GCN and GIN. The results demonstrate that NoisyGNN achieves superior or comparable defense performance to existing methods while minimizing added time complexity. The NoisyGNN approach is model-agnostic, allowing it to be integrated with different GNN architectures. Successful combinations of our NoisyGNN approach with existing defense techniques demonstrate even further improved adversarial defense results. Our code is publicly available at: https://github.com/Sennadir/NoisyGNN.

* Accepted at AAAI-24

Via

Access Paper or Ask Questions

Graph Neural Machine: A New Model for Learning with Tabular Data

Feb 05, 2024

Giannis Nikolentzos, Siyun Wang, Johannes Lutzeyer, Michalis Vazirgiannis

Abstract:In recent years, there has been a growing interest in mapping data from different domains to graph structures. Among others, neural network models such as the multi-layer perceptron (MLP) can be modeled as graphs. In fact, MLPs can be represented as directed acyclic graphs. Graph neural networks (GNNs) have recently become the standard tool for performing machine learning tasks on graphs. In this work, we show that an MLP is equivalent to an asynchronous message passing GNN model which operates on the MLP's graph representation. We then propose a new machine learning model for tabular data, the so-called Graph Neural Machine (GNM), which replaces the MLP's directed acyclic graph with a nearly complete graph and which employs a synchronous message passing scheme. We show that a single GNM model can simulate multiple MLP models. We evaluate the proposed model in several classification and regression datasets. In most cases, the GNM model outperforms the MLP architecture.

Via

Access Paper or Ask Questions

FREDSum: A Dialogue Summarization Corpus for French Political Debates

Dec 08, 2023

Virgile Rennard, Guokan Shang, Damien Grari, Julie Hunter, Michalis Vazirgiannis

Abstract:Recent advances in deep learning, and especially the invention of encoder-decoder architectures, has significantly improved the performance of abstractive summarization systems. The majority of research has focused on written documents, however, neglecting the problem of multi-party dialogue summarization. In this paper, we present a dataset of French political debates for the purpose of enhancing resources for multi-lingual dialogue summarization. Our dataset consists of manually transcribed and annotated political debates, covering a range of topics and perspectives. We highlight the importance of high quality transcription and annotations for training accurate and effective dialogue summarization models, and emphasize the need for multilingual resources to support dialogue summarization in non-English languages. We also provide baseline experiments using state-of-the-art methods, and encourage further research in this area to advance the field of dialogue summarization. Our dataset will be made publicly available for use by the research community.

* Accepted at EMNLP2023 Findings

Via

Access Paper or Ask Questions

Heuristics for Detecting CoinJoin Transactions on the Bitcoin Blockchain

Nov 21, 2023

Hugo Schnoering, Michalis Vazirgiannis

Figure 1 for Heuristics for Detecting CoinJoin Transactions on the Bitcoin Blockchain

Figure 2 for Heuristics for Detecting CoinJoin Transactions on the Bitcoin Blockchain

Figure 3 for Heuristics for Detecting CoinJoin Transactions on the Bitcoin Blockchain

Figure 4 for Heuristics for Detecting CoinJoin Transactions on the Bitcoin Blockchain

Abstract:This research delves into the intricacies of Bitcoin, a decentralized peer-to-peer network, and its associated blockchain, which records all transactions since its inception. While this ensures integrity and transparency, the transparent nature of Bitcoin potentially compromises users' privacy rights. To address this concern, users have adopted CoinJoin, a method that amalgamates multiple transaction intents into a single, larger transaction to bolster transactional privacy. This process complicates individual transaction tracing and disrupts many established blockchain analysis heuristics. Despite its significance, limited research has been conducted on identifying CoinJoin transactions. Particularly noteworthy are varied CoinJoin implementations such as JoinMarket, Wasabi, and Whirlpool, each presenting distinct challenges due to their unique transaction structures. This study delves deeply into the open-source implementations of these protocols, aiming to develop refined heuristics for identifying their transactions on the blockchain. Our exhaustive analysis covers transactions up to block 760,000, offering a comprehensive insight into CoinJoin transactions and their implications for Bitcoin blockchain analysis.

Via

Access Paper or Ask Questions

Automatic Analysis of Substantiation in Scientific Peer Reviews

Nov 20, 2023

Yanzhu Guo, Guokan Shang, Virgile Rennard, Michalis Vazirgiannis, Chloé Clavel

Abstract:With the increasing amount of problematic peer reviews in top AI conferences, the community is urgently in need of automatic quality control measures. In this paper, we restrict our attention to substantiation -- one popular quality aspect indicating whether the claims in a review are sufficiently supported by evidence -- and provide a solution automatizing this evaluation process. To achieve this goal, we first formulate the problem as claim-evidence pair extraction in scientific peer reviews, and collect SubstanReview, the first annotated dataset for this task. SubstanReview consists of 550 reviews from NLP conferences annotated by domain experts. On the basis of this dataset, we train an argument mining system to automatically analyze the level of substantiation in peer reviews. We also perform data analysis on the SubstanReview dataset to obtain meaningful insights on peer reviewing quality in NLP conferences over recent years.

* Accepted to EMNLP 2023 Findings

Via

Access Paper or Ask Questions

The Curious Decline of Linguistic Diversity: Training Language Models on Synthetic Text

Nov 16, 2023

Yanzhu Guo, Guokan Shang, Michalis Vazirgiannis, Chloé Clavel

Abstract:This study investigates the consequences of training large language models (LLMs) on synthetic data generated by their predecessors, an increasingly prevalent practice aimed at addressing the limited supply of human-generated training data. Diverging from the usual emphasis on performance metrics, we focus on the impact of this training methodology on linguistic diversity, especially when conducted recursively over time. To assess this, we developed a set of novel metrics targeting lexical, syntactic, and semantic diversity, applying them in recursive fine-tuning experiments across various natural language generation tasks. Our findings reveal a marked decrease in the diversity of the models' outputs through successive iterations. This trend underscores the potential risks of training LLMs on predecessor-generated text, particularly concerning the preservation of linguistic richness. Our study highlights the need for careful consideration of the long-term effects of such training approaches on the linguistic capabilities of LLMs.

* Work in progress

Via

Access Paper or Ask Questions

Interpretable Graph Neural Networks for Tabular Data

Aug 17, 2023

Amr Alkhatib, Sofiane Ennadir, Henrik Boström, Michalis Vazirgiannis

Figure 1 for Interpretable Graph Neural Networks for Tabular Data

Figure 2 for Interpretable Graph Neural Networks for Tabular Data

Figure 3 for Interpretable Graph Neural Networks for Tabular Data

Figure 4 for Interpretable Graph Neural Networks for Tabular Data

Abstract:Data in tabular format is frequently occurring in real-world applications. Graph Neural Networks (GNNs) have recently been extended to effectively handle such data, allowing feature interactions to be captured through representation learning. However, these approaches essentially produce black-box models, in the form of deep neural networks, precluding users from following the logic behind the model predictions. We propose an approach, called IGNNet (Interpretable Graph Neural Network for tabular data), which constrains the learning algorithm to produce an interpretable model, where the model shows how the predictions are exactly computed from the original input features. A large-scale empirical investigation is presented, showing that IGNNet is performing on par with state-of-the-art machine-learning algorithms that target tabular data, including XGBoost, Random Forests, and TabNet. At the same time, the results show that the explanations obtained from IGNNet are aligned with the true Shapley values of the features without incurring any additional computational overhead.

* 18 pages, 12 figures

Via

Access Paper or Ask Questions