Zero-shot dialogue state tracking (DST) transfers knowledge to unseen domains, reducing the cost of annotating new datasets. Previous zero-shot DST models mainly suffer from domain transferring and partial prediction problems. To address these challenges, we propose Mixture of Prefix Experts (MoPE) to establish connections between similar slots in different domains, which strengthens the model transfer performance in unseen domains. Empirical results demonstrate that MoPE-DST achieves the joint goal accuracy of 57.13% on MultiWOZ2.1 and 55.40% on SGD.
Data augmentation (DA) is crucial to mitigate model training instability and over-fitting problems in low-resource open-domain dialogue generation. However, traditional DA methods often neglect semantic data diversity, restricting the overall quality. Recently, large language models (LLM) have been used for DA to generate diversified dialogues. However, they have limited controllability and tend to generate dialogues with a distribution shift compared to the seed dialogues. To maximize the augmentation diversity and address the controllability problem, we propose \textbf{S}ummary-based \textbf{D}ialogue \textbf{A}ugmentation with LLM (SDA). Our approach enhances the controllability of LLM by using dialogue summaries as a planning tool. Based on summaries, SDA can generate high-quality and diverse dialogue data even with a small seed dataset. To evaluate the efficacy of data augmentation methods for open-domain dialogue, we designed a clustering-based metric to characterize the semantic diversity of the augmented dialogue data. The experimental results show that SDA can augment high-quality and semantically diverse dialogues given a small seed dataset and an LLM, and the augmented data can boost the performance of open-domain dialogue models.
Sharing knowledge between information extraction tasks has always been a challenge due to the diverse data formats and task variations. Meanwhile, this divergence leads to information waste and increases difficulties in building complex applications in real scenarios. Recent studies often formulate IE tasks as a triplet extraction problem. However, such a paradigm does not support multi-span and n-ary extraction, leading to weak versatility. To this end, we reorganize IE problems into unified multi-slot tuples and propose a universal framework for various IE tasks, namely Mirror. Specifically, we recast existing IE tasks as a multi-span cyclic graph extraction problem and devise a non-autoregressive graph decoding algorithm to extract all spans in a single step. It is worth noting that this graph structure is incredibly versatile, and it supports not only complex IE tasks, but also machine reading comprehension and classification tasks. We manually construct a corpus containing 57 datasets for model pretraining, and conduct experiments on 30 datasets across 8 downstream tasks. The experimental results demonstrate that our model has decent compatibility and outperforms or reaches competitive performance with SOTA systems under few-shot and zero-shot settings. The code, model weights, and pretraining corpus are available at https://github.com/Spico197/Mirror .
Belief revision and update, two significant types of belief change, both focus on how an agent modify her beliefs in presence of new information. The most striking difference between them is that the former studies the change of beliefs in a static world while the latter concentrates on a dynamically-changing world. The famous AGM and KM postulates were proposed to capture rational belief revision and update, respectively. However, both of them are too permissive to exclude some unreasonable changes in the iteration. In response to this weakness, the DP postulates and its extensions for iterated belief revision were presented. Furthermore, Rodrigues integrated these postulates in belief update. Unfortunately, his approach does not meet the basic requirement of iterated belief update. This paper is intended to solve this problem of Rodrigues's approach. Firstly, we present a modification of the original KM postulates based on belief states. Subsequently, we migrate several well-known postulates for iterated belief revision to iterated belief update. Moreover, we provide the exact semantic characterizations based on partial preorders for each of the proposed postulates. Finally, we analyze the compatibility between the above iterated postulates and the KM postulates for belief update.
Sentence-by-sentence information extraction from long documents is an exhausting and error-prone task. As the indicator of document skeleton, catalogs naturally chunk documents into segments and provide informative cascade semantics, which can help to reduce the search space. Despite their usefulness, catalogs are hard to be extracted without the assist from external knowledge. For documents that adhere to a specific template, regular expressions are practical to extract catalogs. However, handcrafted heuristics are not applicable when processing documents from different sources with diverse formats. To address this problem, we build a large manually annotated corpus, which is the first dataset for the Catalog Extraction from Documents (CED) task. Based on this corpus, we propose a transition-based framework for parsing documents into catalog trees. The experimental results demonstrate that our proposed method outperforms baseline systems and shows a good ability to transfer. We believe the CED task could fill the gap between raw text segments and information extraction tasks on extremely long documents. Data and code are available at \url{https://github.com/Spico197/CatalogExtraction}
The exploration of thermoelectric materials is challenging considering the large materials space, combined with added exponential degrees of freedom coming from doping and the diversity of synthetic pathways. Here we seek to incorporate historical data and update and refine it using experimental feedback by employing error-correction learning (ECL). We thus learn from prior datasets and then adapt the model to differences in synthesis and characterization that are otherwise difficult to parameterize. We then apply this strategy to discovering thermoelectric materials where we prioritize synthesis at temperatures < 300{\deg}C. We document a previously unreported chemical family of thermoelectric materials, PbSe:SnSb, finding that the best candidate in this chemical family, 2 wt% SnSb doped PbSe, exhibits a power factor more than 2x that of PbSe. Our investigations show that our closed-loop experimentation strategy reduces the required number of experiments to find an optimized material by as much as 3x compared to high-throughput searches powered by state-of-the-art machine learning models. We also observe that this improvement is dependent on the accuracy of prior in a manner that exhibits diminishing returns, and after a certain accuracy is reached, it is factors associated with experimental pathways that dictate the trends.
There are two main challenges in document-level event extraction: 1) argument entities are scattered in different sentences, and 2) event triggers are often not available. To address these challenges, most previous studies mainly focus on building argument chains in an autoregressive way, which is inefficient in both training and inference. In contrast to the previous studies, we propose a fast and lightweight model named as PTPCG. We design a non-autoregressive decoding algorithm to perform event argument combination extraction on pruned complete graphs, which are constructed under the guidance of the automatically selected pseudo triggers. Compared to the previous systems, our system achieves competitive results with lower resource consumption, taking only 3.6% GPU time (pfs-days) for training and up to 8.5 times faster for inference. Besides, our approach shows superior compatibility for the datasets with (or without) triggers and the pseudo triggers can be the supplements for annotated triggers to make further improvements.
In recent years, distantly-supervised relation extraction has achieved a certain success by using deep neural networks. Distant Supervision (DS) can automatically generate large-scale annotated data by aligning entity pairs from Knowledge Bases (KB) to sentences. However, these DS-generated datasets inevitably have wrong labels that result in incorrect evaluation scores during testing, which may mislead the researchers. To solve this problem, we build a new dataset NYTH, where we use the DS-generated data as training data and hire annotators to label test data. Compared with the previous datasets, NYT-H has a much larger test set and then we can perform more accurate and consistent evaluation. Finally, we present the experimental results of several widely used systems on NYT-H. The experimental results show that the ranking lists of the comparison systems on the DS-labelled test data and human-annotated test data are different. This indicates that our human-annotated data is necessary for evaluation of distantly-supervised relation extraction.
Understanding and prediction of the chemical reactions are fundamental demanding in the study of many complex chemical systems. Reactive molecular dynamics (MD) simulation has been widely used for this purpose as it can offer atomic details and can help us better interpret chemical reaction mechanisms. In this study, two reference datasets were constructed and corresponding neural network (NN) potentials were trained based on them. For given large-scale reaction systems, the NN potentials can predict the potential energy and atomic forces of DFT precision, while it is orders of magnitude faster than the conventional DFT calculation. With these two models, reactive MD simulations were performed to explore the combustion mechanisms of hydrogen and methane. Benefit from the high efficiency of the NN model, nanosecond MD trajectories for large-scale systems containing hundreds of atoms were produced and detailed combustion mechanism was obtained. Through further development, the algorithms in this study can be used to explore and discovery reaction mechanisms of many complex reaction systems, such as combustion, synthesis, and heterogeneous catalysis without any predefined reaction coordinates and elementary reaction steps.