We consider the task of few-shot intent detection, which involves training a deep learning model to classify utterances based on their underlying intents using only a small amount of labeled data. The prevailing approach to this problem is continual pre-training, i.e., fine-tuning pre-trained language models (PLMs) on external resources (e.g., conversational corpora, public intent detection datasets, or natural language understanding datasets) before using them as utterance encoders for training an intent classifier. In this paper, we show that continual pre-training may not be essential, since the overfitting problem of PLMs on this task may not be as serious as expected. Specifically, we find that directly fine-tuning PLMs on only a handful of labeled examples already yields decent results compared to methods that employ continual pre-training, and the performance gap diminishes rapidly as the number of labeled examples increases. To make full use of the limited available data, we propose a context augmentation method and leverage sequential self-distillation to boost performance. Comprehensive experiments on real-world benchmarks show that given only two or more labeled samples per class, direct fine-tuning outperforms many strong baselines that utilize external data sources for continual pre-training. The code can be found at https://github.com/hdzhang-code/DFTPlus.
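To make the direct fine-tuning baseline concrete, the sketch below fine-tunes a BERT classifier on a handful of labeled utterances and then runs one generation of sequential self-distillation, where the trained model becomes the teacher for a fresh student. It is a minimal sketch under assumed hyperparameters and toy data, not the paper's exact training recipe (the context augmentation step is omitted).

```python
# Minimal sketch: direct fine-tuning of a PLM for few-shot intent detection,
# followed by one generation of self-distillation. Backbone, data, and
# hyperparameters are illustrative placeholders.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL = "bert-base-uncased"                      # assumed backbone
few_shot_data = [("book a flight to tokyo", 0),  # toy 2-way, 1-shot data
                 ("play some jazz", 1)]
tok = AutoTokenizer.from_pretrained(MODEL)

def train(model, teacher=None, epochs=5, T=2.0):
    opt = torch.optim.AdamW(model.parameters(), lr=2e-5)
    texts, labels = zip(*few_shot_data)
    batch = tok(list(texts), padding=True, return_tensors="pt")
    y = torch.tensor(labels)
    model.train()
    for _ in range(epochs):
        logits = model(**batch).logits
        loss = F.cross_entropy(logits, y)
        if teacher is not None:                  # soften with the previous generation
            with torch.no_grad():
                t_logits = teacher(**batch).logits
            loss = loss + T * T * F.kl_div(
                F.log_softmax(logits / T, dim=-1),
                F.softmax(t_logits / T, dim=-1),
                reduction="batchmean")
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

gen0 = train(AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2))
# Sequential self-distillation: each generation learns from the last.
gen1 = train(AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2),
             teacher=gen0.eval())
```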
New intent discovery aims to uncover novel intent categories from user utterances to expand the set of supported intent classes. It is a critical task for the development and service expansion of a practical dialogue system. Despite its importance, this problem remains under-explored in the literature. Existing approaches typically rely on a large amount of labeled utterances and employ pseudo-labeling methods for representation learning and clustering, which are label-intensive, inefficient, and inaccurate. In this paper, we provide new solutions to two important research questions for new intent discovery: (1) how to learn semantic utterance representations, and (2) how to better cluster utterances. Specifically, we first propose a multi-task pre-training strategy to leverage rich unlabeled data along with external labeled data for representation learning. Then, we design a new contrastive loss to exploit self-supervisory signals in unlabeled data for clustering. Extensive experiments on three intent recognition benchmarks demonstrate the high effectiveness of our proposed method, which outperforms state-of-the-art methods by a large margin in both unsupervised and semi-supervised scenarios. The source code is available at https://github.com/zhang-yu-wei/MTP-CLNN.
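As a rough illustration of a contrastive objective for clustering, the following sketch implements a generic InfoNCE-style loss in which the positives for each utterance are supplied by a mask, e.g., marking its nearest neighbors in embedding space. The function and its arguments are illustrative assumptions, not the paper's exact loss.

```python
# Generic InfoNCE-style contrastive loss with mask-supplied positives;
# an illustrative sketch, not the paper's exact formulation.
import torch
import torch.nn.functional as F

def neighbor_contrastive_loss(z, pos_mask, temperature=0.07):
    """z: (N, d) utterance embeddings.
    pos_mask: (N, N) bool; True where j is a positive for anchor i
    (e.g., a nearest neighbor); the diagonal is ignored."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature                      # pairwise similarities
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)             # never contrast with self
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    pos_mask = pos_mask & ~self_mask
    # average log-likelihood of each anchor's positives
    loss = -(log_prob * pos_mask).sum(1) / pos_mask.sum(1).clamp(min=1)
    return loss.mean()

# Example: positives defined by shared (pseudo-)labels in a toy batch.
z = torch.randn(8, 16)
labels = torch.tensor([0, 0, 1, 1, 2, 2, 3, 3])
mask = labels.unsqueeze(0) == labels.unsqueeze(1)
print(neighbor_contrastive_loss(z, mask))
```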
It is challenging to train a good intent classifier for a task-oriented dialogue system with only a few annotations. Recent studies have shown that fine-tuning pre-trained language models with a small amount of labeled utterances from public benchmarks in a supervised manner is extremely helpful. However, we find that supervised pre-training yields an anisotropic feature space, which may suppress the expressive power of the semantic representations. Inspired by recent research on isotropization, we propose to improve supervised pre-training by regularizing the feature space towards isotropy. We design two regularizers, based on contrastive learning and the correlation matrix respectively, and demonstrate their effectiveness through extensive experiments. Our main finding is that it is promising to regularize supervised pre-training with isotropization to further improve the performance of few-shot intent detection. The source code can be found at https://github.com/fanolabs/isoIntentBert-main.
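The correlation-matrix idea can be illustrated with a small sketch: compute the feature correlation matrix over a batch of embeddings and penalize its deviation from the identity, which pushes dimensions towards being decorrelated, i.e., towards isotropy. This is a generic formulation, not necessarily the paper's exact regularizer or weighting.

```python
# Generic correlation-matrix isotropy regularizer; add lam * reg to the
# supervised pre-training loss. An illustrative sketch only.
import torch

def correlation_regularizer(features):
    """features: (N, d) batch of utterance embeddings, N >= 2.
    Returns the Frobenius distance between the feature correlation
    matrix and the identity (zero iff dimensions are decorrelated)."""
    z = features - features.mean(dim=0, keepdim=True)
    std = z.std(dim=0, keepdim=True).clamp(min=1e-6)
    z = z / std                                    # standardize each dimension
    corr = z.t() @ z / (len(z) - 1)                # (d, d) correlation matrix
    eye = torch.eye(corr.size(0), device=corr.device)
    return torch.linalg.norm(corr - eye)
```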
This paper investigates the effectiveness of pre-training for few-shot intent classification. While existing paradigms commonly further pre-train language models such as BERT on vast amounts of unlabeled text, we find it highly effective and efficient to simply fine-tune BERT with a small set of labeled utterances from public datasets. Specifically, fine-tuning BERT with roughly 1,000 labeled utterances yields a pre-trained model, IntentBERT, which can easily surpass the performance of existing pre-trained models for few-shot intent classification on novel domains with very different semantics. The high effectiveness of IntentBERT confirms the feasibility and practicality of few-shot intent detection, and its strong generalization across different domains suggests that intent classification tasks may share a similar underlying structure, which can be efficiently learned from a small set of labeled data. The source code can be found at https://github.com/hdzhang-code/IntentBERT.
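A typical way to use such a supervised-pre-trained encoder on a novel domain is to freeze it and fit a simple classifier on the few-shot support set. The sketch below illustrates this with mean-pooled BERT embeddings and logistic regression; the checkpoint name and the toy support set are placeholder assumptions, not the released IntentBERT weights.

```python
# Sketch: frozen encoder + simple classifier for a novel few-shot domain.
# The checkpoint and support set are illustrative placeholders.
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")   # stand-in checkpoint
enc = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(texts):
    batch = tok(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = enc(**batch).last_hidden_state
    mask = batch["attention_mask"].unsqueeze(-1)
    return ((out * mask).sum(1) / mask.sum(1)).numpy()      # mean pooling

support_texts = ["set an alarm for 7 am", "what's the weather today"]  # toy support set
support_labels = [0, 1]
clf = LogisticRegression().fit(embed(support_texts), support_labels)
print(clf.predict(embed(["will it rain tomorrow"])))
```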
Out-of-scope intent detection is of practical importance in task-oriented dialogue systems. Since the distribution of outlier utterances is arbitrary and unknown in the training stage, existing methods commonly rely on strong assumptions on data distribution such as mixture of Gaussians to make inference, resulting in either complex multi-step training procedures or hand-crafted rules such as confidence threshold selection for outlier detection. In this paper, we propose a simple yet effective method to train an out-of-scope intent classifier in a fully end-to-end manner by simulating the test scenario in training, which requires no assumption on data distribution and no additional post-processing or threshold setting. Specifically, we construct a set of pseudo outliers in the training stage, by generating synthetic outliers using inlier features via self-supervision and sampling out-of-scope sentences from easily available open-domain datasets. The pseudo outliers are used to train a discriminative classifier that can be directly applied to the test task and generalizes well. We evaluate our method extensively on four benchmark dialogue datasets and observe significant improvements over state-of-the-art approaches. Our code has been released at https://github.com/liam0949/DCLOOS.
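One common way to synthesize feature-level outliers, in the spirit described above, is to convexly combine inlier features drawn from different classes, on the intuition that points lying between class manifolds are unlikely to belong to any in-scope intent. The function below is a generic sketch of this idea, not the paper's exact construction.

```python
# Generic feature-level outlier synthesis by cross-class convex combination;
# an illustrative sketch, not the paper's exact recipe.
import torch

def synthesize_outliers(features, labels, low=0.4, high=0.6):
    """features: (N, d) inlier features; labels: (N,) intent labels.
    Returns mixed features to be treated as an extra 'out-of-scope' class."""
    perm = torch.randperm(len(features))
    keep = labels != labels[perm]                  # only mix across classes
    lam = torch.empty(int(keep.sum()), 1).uniform_(low, high)
    return lam * features[keep] + (1 - lam) * features[perm][keep]
```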
A newly proposed chemical-reaction-inspired metaheuristic, Chemical Reaction Optimization (CRO), has been applied to many optimization problems in both discrete and continuous domains. To reduce the effort of parameter tuning, this paper reduces the number of optimization parameters in canonical CRO and develops an adaptive scheme to evolve them. Our proposed Adaptive CRO (ACRO) adapts better to different optimization problems. We perform simulations with ACRO on a widely-used benchmark of continuous problems. The simulation results show that ACRO has superior performance over canonical CRO.
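As an illustration of parameter adaptation in this family of algorithms, the sketch below attaches a step-size parameter to each molecule and perturbs it together with the solution, in the style of self-adaptive evolution strategies. It conveys the idea of evolving parameters instead of hand-tuning them, but it is not ACRO's actual adaptation scheme, and the greedy acceptance rule simplifies CRO's energy-based bookkeeping.

```python
# Illustrative self-adapted local move for a molecule; not ACRO's scheme.
import random

def adaptive_on_wall_collision(solution, step, objective, tau=0.3):
    """One local move with a per-molecule step size that is mutated
    alongside the solution, so each molecule evolves its own step."""
    new_step = max(step * (1 + tau * random.gauss(0, 1)), 1e-8)
    candidate = [x + new_step * random.gauss(0, 1) for x in solution]
    if objective(candidate) < objective(solution):   # greedy accept (simplified)
        return candidate, new_step
    return solution, step
```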
The set covering problem (SCP) is one of the representative combinatorial optimization problems, with many practical applications. This paper investigates the development of an algorithm to solve SCP by employing chemical reaction optimization (CRO), a general-purpose metaheuristic. The proposed algorithm is tested on a wide range of benchmark instances of SCP. The simulation results indicate that it gives outstanding performance compared with other heuristics and metaheuristics in solving SCP.
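For a metaheuristic such as CRO, SCP solutions are commonly encoded as bit vectors over the candidate subsets, with uncovered elements penalized in the fitness. The following is a generic sketch of such an encoding and fitness function, not the paper's specific operators.

```python
# Generic bit-vector encoding and penalized fitness for SCP;
# an illustrative sketch, not the paper's operator design.
def scp_cost(selection, costs, covers, universe, penalty=1e6):
    """selection: list of 0/1 over candidate subsets; costs[i]: cost of
    subset i; covers[i]: set of elements subset i covers; universe: set
    of all elements that must be covered."""
    covered, total = set(), 0.0
    for i, bit in enumerate(selection):
        if bit:
            total += costs[i]
            covered |= covers[i]
    return total + penalty * len(universe - covered)  # punish uncovered elements
```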
Optimization techniques are frequently applied in science and engineering research and development. Evolutionary algorithms, as a kind of general-purpose metaheuristic, have been shown to be very effective in solving a wide range of optimization problems. A recently proposed chemical-reaction-inspired metaheuristic, Chemical Reaction Optimization (CRO), has been applied to solve many global optimization problems. However, the functionality of the inter-molecular ineffective collision operator in the canonical CRO design overlaps that of the on-wall ineffective collision operator, which can potentially impair the overall performance. In this paper, we propose a new inter-molecular ineffective collision operator for CRO for global optimization. To fully utilize our newly proposed operator, we also design a scheme to adapt the algorithm to optimization problems with different search space characteristics. We analyze the performance of our proposed algorithm with a number of widely used benchmark functions. The simulation results indicate that the new algorithm has superior performance over the canonical CRO.
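To illustrate what a genuinely pairwise inter-molecular move can look like, as opposed to two independent on-wall-style perturbations, the sketch below probabilistically exchanges coordinates between the two molecules before lightly perturbing both. It is illustrative only and not the operator proposed in the paper.

```python
# Illustrative pairwise inter-molecular move; not the paper's operator.
import random

def intermolecular_collision(m1, m2, prob=0.5, sigma=0.1):
    """Exchange coordinates between two molecules with probability `prob`,
    then lightly perturb both, so the move depends on the pair jointly
    rather than acting as two independent local perturbations."""
    c1, c2 = list(m1), list(m2)
    for i in range(len(c1)):
        if random.random() < prob:              # exchange information
            c1[i], c2[i] = c2[i], c1[i]
        c1[i] += sigma * random.gauss(0, 1)     # small perturbation
        c2[i] += sigma * random.gauss(0, 1)
    return c1, c2
```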
An electric vehicle (EV) may be used as an energy storage device that allows bi-directional electricity flow between the vehicle's battery and the electric power grid. In order to flatten the load profile of the electricity system, EV scheduling has become a hot research topic in recent years. In this paper, we propose a new formulation of the joint scheduling of EVs and Unit Commitment (UC), called EVUC. Our formulation considers the characteristics of EVs while minimizing the system's total running cost. We employ Chemical Reaction Optimization (CRO), a general-purpose optimization algorithm, to solve this problem. Simulation results on a widely used set of instances indicate that CRO can optimize this problem effectively.
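A highly simplified fitness for such a joint schedule might sum generator fuel and startup costs while letting aggregate EV charging or discharging shift the hourly net demand, as sketched below. All names are illustrative, and real EVUC formulations would add ramp limits, reserves, and battery-state dynamics.

```python
# Toy running-cost evaluation for a joint EV / unit-commitment schedule;
# a simplified sketch of the kind of objective EVUC optimizes.
def evuc_cost(gen_on, gen_output, ev_power, demand, fuel_cost, startup_cost):
    """gen_on[t][g]: 1 if generator g is on at hour t; gen_output[t][g]: MW;
    ev_power[t]: aggregate EV power (positive = charging, negative = V2G
    discharge); demand[t]: MW; fuel_cost(mw) -> cost of producing mw."""
    total, penalty = 0.0, 1e6
    for t in range(len(demand)):
        supply = sum(o for on, o in zip(gen_on[t], gen_output[t]) if on)
        net = demand[t] + ev_power[t]           # discharge reduces net demand
        total += sum(fuel_cost(o) for on, o in zip(gen_on[t], gen_output[t]) if on)
        if t > 0:
            total += sum(startup_cost for a, b in zip(gen_on[t - 1], gen_on[t])
                         if not a and b)        # units switched on this hour
        total += penalty * max(0.0, net - supply)   # unmet demand penalty
    return total
```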
Air pollution monitoring is a popular research topic, and many monitoring systems have been developed. In this paper, we formulate the Bus Sensor Deployment Problem (BSDP) to select the bus routes on which sensors are deployed, and we use Chemical Reaction Optimization (CRO) to solve it. CRO is a recently proposed metaheuristic designed to solve a wide range of optimization problems. Using real-world data, namely the Hong Kong Island bus route data, we perform a series of simulations, and the results show that CRO is capable of solving this optimization problem efficiently.
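A toy fitness for this kind of deployment problem could score a set of chosen routes by the fraction of spatial grid cells they cover, rejecting selections that exceed the sensor budget, as sketched below. The data structures are illustrative; a realistic BSDP objective might weight cells by traffic or population exposure.

```python
# Toy coverage fitness for bus-route sensor deployment; a simplified
# stand-in for the BSDP objective, not the paper's exact formulation.
def bsdp_coverage(selected_routes, route_cells, n_sensors):
    """selected_routes: iterable of route ids carrying a sensor;
    route_cells[r]: set of spatial grid cells route r passes through;
    n_sensors: number of sensors available."""
    selected_routes = list(selected_routes)
    if len(selected_routes) > n_sensors:
        return 0.0                               # infeasible: over budget
    covered = set().union(*(route_cells[r] for r in selected_routes))
    all_cells = set().union(*route_cells.values())
    return len(covered) / len(all_cells)         # fraction of cells covered
```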