Bernardo Magnini

Unraveling ChatGPT: A Critical Analysis of AI-Generated Goal-Oriented Dialogues and Annotations

May 23, 2023
Tiziano Labruna, Sofia Brenna, Andrea Zaninello, Bernardo Magnini

Large pre-trained language models have exhibited unprecedented capabilities in producing high-quality text via prompting techniques. This opens up new possibilities for data collection and annotation, particularly in situations where such data is scarce, complex to gather, expensive, or even sensitive. In this paper, we explore the potential of these models to generate and annotate goal-oriented dialogues, and conduct an in-depth analysis to evaluate their quality. Our experiments employ ChatGPT and encompass three categories of goal-oriented dialogues (task-oriented, collaborative, and explanatory), two generation modes (interactive and one-shot), and two languages (English and Italian). Based on extensive human evaluation, we demonstrate that the quality of generated dialogues and annotations is on par with those generated by humans.

Recent Neural Methods on Slot Filling and Intent Classification for Task-Oriented Dialogue Systems: A Survey

Nov 01, 2020
Samuel Louvan, Bernardo Magnini

In recent years, fostered by deep learning technologies and by the high demand for conversational AI, various approaches have been proposed to elicit and understand user needs in task-oriented dialogue systems. We focus on two core tasks, slot filling (SF) and intent classification (IC), and survey how neural-based models have rapidly evolved to address natural language understanding in dialogue systems. We introduce three families of neural architectures: independent models, which handle SF and IC separately; joint models, which exploit the mutual benefit of solving the two tasks simultaneously; and transfer learning models, which scale to new domains. We discuss the current state of research in SF and IC and highlight challenges that still require attention.
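The two tasks can be made concrete with a toy, lexicon-based tagger. Everything below (the city list, the intent keywords, the ATIS-style labels) is an illustrative assumption, not one of the neural models the survey covers:

```python
# Toy illustration of the survey's two NLU tasks: intent classification
# (one label per utterance) and slot filling (one BIO tag per token).
# The lexicon and label names are made-up, ATIS-style assumptions.
CITIES = {"boston", "denver", "seattle"}
INTENT_KEYWORDS = {"flight": "atis_flight", "fare": "atis_airfare"}

def tag_utterance(utterance):
    """Return (intent, [(token, BIO tag), ...]) for a toy travel utterance."""
    tokens = utterance.lower().split()
    intent = next((INTENT_KEYWORDS[t] for t in tokens if t in INTENT_KEYWORDS),
                  "unknown")
    tags = []
    for i, tok in enumerate(tokens):
        if tok in CITIES:
            # A city preceded by "to" is tagged as destination, else origin.
            tags.append((tok, "B-toloc" if i > 0 and tokens[i - 1] == "to"
                         else "B-fromloc"))
        else:
            tags.append((tok, "O"))
    return intent, tags
```

A neural model replaces the lookup with learned representations, but the input/output contract — one intent per utterance, one tag per token — is the same.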

* COLING 2020 

Simple is Better! Lightweight Data Augmentation for Low Resource Slot Filling and Intent Classification

Sep 08, 2020
Samuel Louvan, Bernardo Magnini

Neural-based models have achieved outstanding performance on slot filling and intent classification, when fairly large in-domain training data are available. However, as new domains are frequently added, creating sizeable data is expensive. We show that lightweight augmentation, a set of augmentation methods involving word span and sentence level operations, alleviates data scarcity problems. Our experiments on limited data settings show that lightweight augmentation yields significant performance improvement on slot filling on the ATIS and SNIPS datasets, and achieves competitive performance with respect to more complex, state-of-the-art, augmentation approaches. Furthermore, lightweight augmentation is also beneficial when combined with pre-trained LM-based models, as it improves BERT-based joint intent and slot filling models.
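One span-level operation in the spirit of the lightweight augmentation described above is slot-value substitution: a slot-tagged token is swapped for another value of the same slot type, yielding a new training utterance. The tiny value lists below are illustrative assumptions:

```python
import random

# Sketch of slot-value substitution, a word-span augmentation operation.
# The slot types and value lists are made up for illustration.
SLOT_VALUES = {
    "city": ["boston", "denver", "seattle"],
    "airline": ["delta", "united"],
}

def substitute_slot_values(tokens, tags, rng=random):
    """Return a new token list with each slot value swapped for an alternative."""
    augmented = []
    for tok, tag in zip(tokens, tags):
        slot = tag[2:] if tag.startswith("B-") else None
        if slot in SLOT_VALUES:
            alternatives = [v for v in SLOT_VALUES[slot] if v != tok]
            augmented.append(rng.choice(alternatives))
        else:
            augmented.append(tok)
    return augmented
```

Because the operation only touches tagged spans, the BIO tags of the original utterance remain valid for the augmented one, which is what keeps this kind of augmentation cheap.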

* Accepted at PACLIC 2020 - The 34th Pacific Asia Conference on Language, Information and Computation 

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

Mar 30, 2020
Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality. However, language barriers impacting business, cross-lingual and cross-cultural communication are still omnipresent. Language Technologies (LTs) are a powerful means to break down these barriers. While the last decade has seen various initiatives that created a multitude of approaches and technologies tailored to Europe's specific needs, there is still an immense level of fragmentation. At the same time, AI has become an increasingly important concept in the European Information and Communication Technology area. For a few years now, AI, including many opportunities, synergies but also misconceptions, has been overshadowing every other topic. We present an overview of the European LT landscape, describing funding programmes, activities, actions and challenges in the different countries with regard to LT, including the current state of play in industry and the LT market. We present a brief overview of the main LT-related activities on the EU level in the last ten years and develop strategic guidance with regard to four key dimensions.

* Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020). To appear 

Domain-Aware Dialogue State Tracker for Multi-Domain Dialogue Systems

Jan 21, 2020
Vevake Balaraman, Bernardo Magnini

In task-oriented dialogue systems, the dialogue state tracker (DST) component is responsible for predicting the state of the dialogue based on the dialogue history. Current DST approaches rely on a predefined domain ontology, which limits their effective use in large-scale conversational agents, where the DST constantly needs to interface with ever-increasing numbers of services and APIs. To overcome this drawback, we propose a domain-aware dialogue state tracker that is completely data-driven and designed to predict over dynamic service schemas. The proposed model uses domain and slot information to extract both domain-specific and slot-specific representations for a given dialogue, and then uses these representations to predict the values of the corresponding slots. By integrating this mechanism with a pretrained language model (i.e. BERT), our approach can effectively learn semantic relations.

A Robust Data-Driven Approach for Dialogue State Tracking of Unseen Slot Values

Nov 01, 2019
Vevake Balaraman, Bernardo Magnini

A dialogue state tracker is a key component of dialogue systems that estimates the beliefs over possible user goals at each dialogue turn. Deep learning approaches using recurrent neural networks have shown state-of-the-art performance for dialogue state tracking. Generally, these approaches assume a predefined candidate list and struggle to predict dialogue state values that are not seen during training. This makes extending the candidate list for a slot without retraining the model infeasible, and also limits modelling for low-resource domains where training data for slot values are expensive to obtain. In this paper, we propose a novel dialogue state tracker based on a copy mechanism that can effectively track such unseen slot values without compromising performance on slot values seen during training. The proposed model is also flexible in extending the candidate list without requiring any retraining or change to the model. We evaluate the proposed model on several benchmark datasets (DSTC2, DSTC3 and WoZ2.0) and show that our approach outperforms other end-to-end data-driven approaches in tracking unseen slot values, while also providing significant advantages for DST modelling.
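The copy idea can be illustrated with a toy tracker: the predicted value is copied directly out of the user utterance rather than chosen from a fixed candidate list, so a value never seen in training can still be tracked. The cue-word heuristic below is a deliberately crude stand-in assumption for the learned copy scores of the actual model:

```python
# Toy illustration of copy-based slot tracking. The cue words are a
# hypothetical stand-in for the learned scoring in the paper's model.
SLOT_CUES = {"food": {"food", "restaurant", "cuisine"}}

def track_slot(slot, utterance):
    """Predict a value for `slot` by copying the word preceding a cue word."""
    tokens = utterance.lower().split()
    cues = SLOT_CUES.get(slot, set())
    for i, tok in enumerate(tokens):
        if tok in cues and i > 0:
            # The value is copied from the utterance, not from a fixed list,
            # so unseen values such as "jamaican" can still be predicted.
            return tokens[i - 1]
    return "none"
```

The point of the sketch is the output space: any utterance token is a legal prediction, which is what makes extending the candidate list retraining-free.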

Scalable Neural Dialogue State Tracking

Oct 22, 2019
Vevake Balaraman, Bernardo Magnini

A Dialogue State Tracker (DST) is a key component of a dialogue system that estimates the beliefs over possible user goals at each dialogue turn. Most current DST trackers use recurrent neural networks and are based on complex architectures that manage several aspects of a dialogue, including the user utterance, the system actions, and the slot-value pairs defined in a domain ontology. However, the complexity of such neural architectures incurs considerable latency in dialogue state prediction, which limits the deployment of these models in real-world applications, particularly when task scalability (i.e. the number of slots) is a crucial factor. In this paper, we propose an innovative neural model for dialogue state tracking, named Global encoder and Slot-Attentive decoders (G-SAT), which can predict the dialogue state with very low latency while maintaining high performance. We report experiments on three languages (English, Italian, and German) of the WoZ2.0 dataset, and show that the proposed approach provides competitive advantages over state-of-the-art DST systems, both in accuracy and in prediction time, being over 15 times faster than the other systems.

* 8 pages, 3 figures, Accepted at ASRU 2019 

Grounding the Lexical Sets of Causative-Inchoative Verbs with Word Embedding

Oct 03, 2016
Edoardo Maria Ponti, Elisabetta Jezek, Bernardo Magnini

Lexical sets contain the words that fill the argument positions of a verb in one of its senses. They can be grounded empirically through automatic extraction from corpora. The purpose of this paper is to demonstrate that their vector representation, based on word embeddings, provides insights into many linguistic phenomena, and in particular into verbs undergoing the causative-inchoative alternation. A first experiment investigates the internal structure of the sets, which are known to be cognitively radial and continuous categories. A second experiment shows that the distance between the subject set and the object set correlates with a semantic factor, namely the spontaneity of the verb.

* 5 pages, 4 figures, accepted at: Third Italian Conference on Computational Linguistics (CLIC-it). 5-6 December 2016, Napoli (Italy) 