
Mohammad Kachuee


Data Augmentation for Improving Tail-traffic Robustness in Skill-routing for Dialogue Systems

Jun 07, 2023
Ting-Wei Wu, Fatemeh Sheikholeslami, Mohammad Kachuee, Jaeyoung Do, Sungjin Lee


Large-scale conversational systems typically rely on a skill-routing component to route a user request to an appropriate skill and interpretation to serve the request. In such a system, the agent is responsible for serving thousands of skills and interpretations, which creates a long-tail distribution due to the natural frequency of requests. For example, samples related to playing music might be a thousand times more frequent than those asking for theatre show times. Moreover, the inputs used for ML-based skill routing are often a heterogeneous mix of strings, embedding vectors, and categorical and scalar features, which makes employing augmentation-based long-tail learning approaches challenging. To improve skill-routing robustness, we propose an augmentation of heterogeneous skill-routing data and a training procedure targeted at robust operation in long-tail data regimes. We explore a variety of conditional encoder-decoder generative frameworks to perturb the original data fields and create synthetic training data. To demonstrate the effectiveness of the proposed method, we conduct extensive experiments using real-world data from a commercial conversational system. Based on the experimental results, the proposed approach improves more than 80% (51 out of 63) of intents with fewer than 10K traffic instances in the skill-routing replication task.
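The field-perturbation idea can be sketched minimally: embed a categorical field, jitter it in embedding space, decode back to the nearest category, and add noise to scalar fields directly. This is an illustrative toy; the vocabulary, embedding table, and `perturb_sample` helper are hypothetical stand-ins, not the paper's conditional encoder-decoder models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical categorical vocabulary with a random embedding table.
vocab = ["PlayMusic", "GetWeather", "ShowTimes"]
emb = rng.normal(size=(len(vocab), 4))  # one 4-d embedding per category

def perturb_sample(cat_idx, scalars, noise_scale=0.1):
    """Perturb a heterogeneous (categorical, scalar) sample.

    The categorical field is mapped to its embedding, jittered with
    Gaussian noise, then decoded back to the nearest category; scalar
    fields receive additive noise directly.
    """
    z = emb[cat_idx] + rng.normal(scale=noise_scale, size=emb.shape[1])
    new_cat = int(np.argmin(np.linalg.norm(emb - z, axis=1)))  # decode
    new_scalars = scalars + rng.normal(scale=noise_scale, size=scalars.shape)
    return new_cat, new_scalars

# Generate 20 synthetic variants of one sample.
aug = [perturb_sample(2, np.array([0.5, 1.0])) for _ in range(20)]
```

With small noise the decoded category usually stays the same, so the augmentation mostly perturbs the continuous fields; raising `noise_scale` trades fidelity for diversity.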


Scalable and Safe Remediation of Defective Actions in Self-Learning Conversational Systems

May 17, 2023
Sarthak Ahuja, Mohammad Kachuee, Fateme Sheikholeslami, Weiqing Liu, Jaeyoung Do


Off-policy reinforcement learning has been a driving force for state-of-the-art conversational AI, leading to more natural human-agent interactions and improving user satisfaction for goal-oriented agents. However, in large-scale commercial settings, it is often challenging to balance policy improvements against experience continuity across the broad spectrum of applications handled by such systems. In the literature, off-policy evaluation and guard-railing on aggregate statistics have commonly been used to address this problem. In this paper, we propose a method for curating and leveraging high-precision samples sourced from historical regression incident reports to validate, safeguard, and improve policies prior to online deployment. We conducted extensive experiments using data from a real-world conversational system and actual regression incidents. The proposed method is currently deployed in our production system to protect customers against broken experiences and enable long-term policy improvements.
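The guard-railing step can be pictured as a simple pre-deployment gate: a candidate policy must reproduce the expected action on (almost) all curated incident samples. The `passes_guardrail` helper and its threshold are illustrative assumptions, not the paper's exact validation procedure.

```python
def passes_guardrail(policy, incidents, min_accuracy=0.99):
    """Gate a candidate policy before online deployment.

    `incidents` is a list of (context, expected_action) pairs curated
    from historical regression reports; the policy passes only if it
    handles at least `min_accuracy` of them correctly.
    """
    correct = sum(policy(ctx) == expected for ctx, expected in incidents)
    return correct / len(incidents) >= min_accuracy
```

Because the curated samples are high-precision by construction, even a near-perfect threshold produces few false rejections while blocking known broken experiences.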

* Accepted at ACL 2023 Industry Track 

Constrained Policy Optimization for Controlled Self-Learning in Conversational AI Systems

Sep 17, 2022
Mohammad Kachuee, Sungjin Lee


Recently, self-learning methods based on user satisfaction metrics and contextual bandits have shown promising results in enabling consistent improvements in conversational AI systems. However, directly targeting such metrics with off-policy bandit learning objectives often increases the risk of making abrupt policy changes that break the current user experience. In this study, we introduce a scalable framework that supports fine-grained exploration targets for individual domains via user-defined constraints. For example, we may want to ensure fewer policy deviations in business-critical domains such as shopping, while allocating more exploration budget to domains such as music. Furthermore, we present a novel meta-gradient learning approach that is scalable and practical for addressing this problem. The proposed method adaptively adjusts constraint-violation penalty terms through a meta objective that encourages balanced constraint satisfaction across domains. We conduct extensive experiments using data from a real-world conversational AI system on a set of realistic constraint benchmarks. Based on the experimental results, we demonstrate that the proposed approach achieves the best balance between policy value and constraint satisfaction rate.
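One way to picture the adaptive penalty adjustment is a projected dual-ascent-style update, where each domain's penalty weight grows with its constraint violation and shrinks when the constraint is satisfied. This is a simplified stand-in for the paper's meta-gradient objective; the function name and learning rate are hypothetical.

```python
import numpy as np

def update_penalties(lambdas, violations, lr=0.1):
    """One step of adapting per-domain penalty weights.

    `violations[d]` > 0 means domain d exceeds its exploration
    constraint (penalty grows); negative values mean slack (penalty
    shrinks). Weights are projected back onto [0, inf).
    """
    lambdas = lambdas + lr * violations
    return np.clip(lambdas, 0.0, None)
```

Iterating this update drives penalties toward the point where all domains sit near their constraint boundaries, which is the balanced-satisfaction behavior the meta objective encourages.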


Scalable and Robust Self-Learning for Skill Routing in Large-Scale Conversational AI Systems

Apr 14, 2022
Mohammad Kachuee, Jinseok Nam, Sarthak Ahuja, Jin-Myung Won, Sungjin Lee


Skill routing is an important component of large-scale conversational systems. In contrast to traditional rule-based skill routing, state-of-the-art systems use a model-based approach to enable natural conversations. To provide the supervision signal required to train such models, ideas such as human annotation, replication of a rule-based system, relabeling based on user paraphrases, and bandit-based learning have been suggested. However, these approaches (a) do not scale with the number of skills and skill on-boarding, (b) require very costly expert annotation or rule design, and (c) introduce risks to the user experience with each model update. In this paper, we present a scalable self-learning approach that explores routing alternatives without causing abrupt policy changes that break the user experience, learns from user interactions, and incrementally improves routing via frequent model refreshes. To enable such robust frequent model updates, we suggest a simple and effective approach that ensures controlled policy updates for individual domains, followed by an off-policy evaluation for making deployment decisions without any need for lengthy A/B experimentation. We conduct various offline and online A/B experiments on a commercial large-scale conversational system to demonstrate the effectiveness of the proposed method in real-world production settings.
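Off-policy evaluation for deployment decisions is commonly built on an inverse-propensity-scoring (IPS) estimator over logged interactions. The sketch below shows that generic estimator; the log-tuple layout and `ips_estimate` name are assumptions, not necessarily the paper's exact estimator.

```python
def ips_estimate(logs):
    """IPS estimate of a candidate policy's expected reward.

    Each log entry is (context, action, reward, log_prob, new_prob),
    where log_prob is the logging policy's probability of the taken
    action and new_prob is the candidate policy's probability of it.
    """
    total = 0.0
    for ctx, action, reward, log_p, new_p in logs:
        total += (new_p / log_p) * reward  # reweight logged reward
    return total / len(logs)
```

If the estimate for a candidate routing model falls below the logged baseline, deployment can be held back without running a live A/B test.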

* NAACL 2022 

Domain-Aware Contrastive Knowledge Transfer for Multi-domain Imbalanced Data

Apr 05, 2022
Zixuan Ke, Mohammad Kachuee, Sungjin Lee


In many real-world machine learning applications, samples belong to a set of domains; e.g., for product reviews, each review belongs to a product category. In this paper, we study multi-domain imbalanced learning (MIL), the scenario in which there is imbalance not only in classes but also in domains. In the MIL setting, different domains exhibit different patterns, and there is a varying degree of similarity and divergence among domains, posing both opportunities and challenges for transfer learning, especially when faced with limited training data. We propose a novel domain-aware contrastive knowledge transfer method called DCMI to (1) identify shared domain knowledge to encourage positive transfer among similar domains (in particular from head domains to tail domains), and (2) isolate domain-specific knowledge to minimize negative transfer from dissimilar domains. We evaluated DCMI on three different datasets, showing significant improvements in different MIL scenarios.
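A supervised contrastive loss, which pulls same-label samples together and pushes different-label samples apart in embedding space, is the kind of building block such a contrastive transfer method rests on. The NumPy sketch below is a generic version of that loss, not DCMI's exact objective.

```python
import numpy as np

def contrastive_loss(z, labels, tau=0.5):
    """Supervised contrastive loss over embeddings z (rows = samples).

    Embeddings are L2-normalized, pairwise cosine similarities are
    temperature-scaled, and each sample's loss is the negative mean
    log-probability of its same-label (positive) pairs.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / tau
    np.fill_diagonal(sim, -np.inf)  # exclude self-pairs
    logp = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    loss, n = 0.0, 0
    for i in range(len(z)):
        pos = (labels == labels[i]) & (np.arange(len(z)) != i)
        if pos.any():
            loss -= logp[i, pos].mean()
            n += 1
    return loss / n
```

In a domain-aware variant, one would additionally restrict or reweight which pairs count as positives based on domain similarity, so tail domains borrow structure mainly from similar head domains.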

* ACL WASSA 2022 

Real-Time Decentralized Knowledge Transfer at the Edge

Nov 11, 2020
Orpaz Goldstein, Mohammad Kachuee, Dereck Shiell, Majid Sarrafzadeh


The proliferation of edge networks creates islands of learning agents working on local streams of data. Transferring knowledge between these agents in real time without exposing private data allows for collaboration that decreases learning time and increases model confidence. Incorporating knowledge from data that was not seen by a local model makes it possible to debias that model or extend its classification abilities to previously unseen data. Transferring knowledge in a decentralized approach allows models to retain their local insights, in turn allowing for local flavors of a machine learning model. This approach suits the decentralized architecture of edge networks, as a local edge node will serve a community of learning agents that are likely to encounter similar data. We propose a method based on knowledge distillation for pairwise knowledge transfer pipelines and compare it to other popular knowledge transfer methods. Additionally, we test different scenarios of knowledge transfer network construction and show the practicality of our approach. Based on our experiments, we show that knowledge transfer using our model outperforms common methods in a real-time transfer scenario.
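Pairwise knowledge distillation typically minimizes the KL divergence between temperature-softened teacher and student output distributions, transferring predictions rather than private data. A minimal NumPy sketch of that standard objective (not necessarily the paper's exact formulation):

```python
import numpy as np

def softmax(x, T=1.0):
    """Temperature-scaled softmax along the last axis."""
    e = np.exp((x - x.max(axis=-1, keepdims=True)) / T)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The T*T factor keeps gradient magnitudes comparable across
    temperatures, as in standard knowledge distillation.
    """
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)
```

Only logits cross the network between peers, which is what makes the scheme compatible with keeping raw local data private.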


Self-Supervised Contrastive Learning for Efficient User Satisfaction Prediction in Conversational Agents

Oct 21, 2020
Mohammad Kachuee, Hao Yuan, Young-Bum Kim, Sungjin Lee


Turn-level user satisfaction is one of the most important performance metrics for conversational agents. It can be used to monitor the agent's performance and provide insights about defective user experiences. Moreover, a powerful satisfaction model can be used as an objective function that a conversational agent continuously optimizes for. While end-to-end deep learning has shown promising results, access to the large number of reliable annotated samples required by these methods remains challenging. In a large-scale conversational system, there is a growing number of newly developed skills, making the traditional data collection, annotation, and modeling process impractical due to the required annotation costs as well as the turnaround times. In this paper, we suggest a self-supervised contrastive learning approach that leverages the pool of unlabeled data to learn user-agent interactions. We show that models pre-trained with the self-supervised objective transfer well to user satisfaction prediction. In addition, we propose a novel few-shot transfer learning approach that ensures better transferability for very small sample sizes. The suggested few-shot method does not require any inner-loop optimization process and is scalable to very large datasets and complex models. Based on our experiments using real-world data from a large-scale commercial system, the suggested approach is able to significantly reduce the required number of annotations while improving generalization on unseen out-of-domain skills.
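A few-shot transfer scheme with no inner-loop optimization can be as simple as nearest-class-centroid prediction on frozen pretrained embeddings. The sketch below illustrates that general idea, not the paper's specific method; the function name and data layout are assumptions.

```python
import numpy as np

def few_shot_predict(support_x, support_y, query_x):
    """Nearest-class-centroid prediction on frozen embeddings.

    `support_x` holds the few labeled embeddings per class (rows),
    `support_y` their labels; each query is assigned the label of the
    closest class centroid. No gradient steps are needed at transfer
    time, so the scheme scales to large models and datasets.
    """
    classes = np.unique(support_y)
    centroids = np.stack([support_x[support_y == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(query_x[:, None, :] - centroids[None], axis=-1)
    return classes[d.argmin(axis=1)]
```

The quality of such a classifier depends almost entirely on the pretrained embedding space, which is exactly what the self-supervised contrastive pretraining is meant to shape.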


Group-Connected Multilayer Perceptron Networks

Dec 20, 2019
Mohammad Kachuee, Sajad Darabi, Shayan Fazeli, Majid Sarrafzadeh


Despite the success of deep learning in domains such as images, voice, and graphs, there has been little progress in deep representation learning for domains without a known structure between features; for instance, a tabular dataset of demographic and clinical factors where the feature interactions are not given as a prior. In this paper, we propose Group-Connected Multilayer Perceptron (GMLP) networks to enable deep representation learning in these domains. GMLP is based on the idea of learning expressive feature combinations (groups) and exploiting them to reduce network complexity by defining local group-wise operations. During the training phase, GMLP learns a sparse feature-grouping matrix using a temperature-annealed softmax with an added entropy loss term to encourage sparsity. Furthermore, we suggest an architecture resembling binary trees, in which group-wise operations are followed by pooling operations that combine information, reducing the number of groups as the network grows in depth. To evaluate the proposed method, we conducted experiments on five real-world datasets covering various application areas. Additionally, we provide visualizations on MNIST and synthesized data. According to the results, GMLP is able to successfully learn and exploit expressive feature combinations and achieves state-of-the-art classification performance on different datasets.
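The grouping mechanism can be sketched directly: a temperature-scaled softmax over learnable logits yields a soft feature-to-group assignment matrix, and a row-entropy term added to the loss pushes assignments toward sparsity as the temperature is annealed. Function names here are illustrative, not the paper's code.

```python
import numpy as np

def grouping_matrix(logits, temperature):
    """Soft feature-to-group assignments via temperature-scaled softmax.

    Each row of `logits` scores one feature against all groups;
    annealing `temperature` toward 0 sharpens rows into near-hard
    (sparse) group memberships.
    """
    z = logits / temperature
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def entropy_penalty(G, eps=1e-12):
    """Mean row entropy of the grouping matrix; adding this term to
    the training loss encourages sparse (low-entropy) assignments."""
    return float(-(G * np.log(G + eps)).sum(axis=1).mean())
```

At high temperature every feature belongs softly to all groups; as training anneals the temperature and the entropy term takes effect, each feature commits to one group, which is what enables the cheap local group-wise operations.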


Cost-Sensitive Feature-Value Acquisition Using Feature Relevance

Dec 19, 2019
Kimmo Kärkkäinen, Mohammad Kachuee, Orpaz Goldstein, Majid Sarrafzadeh


In many real-world machine learning problems, feature values are not readily available. To make predictions, some of the missing features have to be acquired, which can incur a cost in money, computational time, or human time, depending on the problem domain. This leads to the problem of choosing which features to use at prediction time. The chosen features should increase prediction accuracy at a low cost, but determining which features will do so is challenging. The choice should take into account the previously acquired feature values as well as the feature costs. This paper proposes a novel approach to address this problem: the most useful features are chosen adaptively based on how relevant they are to the prediction task as well as what the corresponding feature costs are. Our approach uses a generic neural network architecture, which is suitable for a wide range of problems. We evaluate our approach on three cost-sensitive datasets, including the Yahoo! Learning to Rank Competition dataset as well as two health datasets. We show that our approach achieves high accuracy with a lower cost than current state-of-the-art approaches.
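The greedy core of such an acquisition strategy can be sketched as picking, at each step, the unacquired feature with the highest predicted relevance per unit cost. The `next_feature` helper is a hypothetical simplification; in the paper, relevance would come from a learned model conditioned on the features acquired so far.

```python
def next_feature(relevance, cost, acquired):
    """Greedy cost-sensitive acquisition sketch.

    Among features not yet in `acquired`, return the index with the
    best relevance-per-cost ratio, or None if all are acquired.
    """
    best, best_score = None, float("-inf")
    for i, (r, c) in enumerate(zip(relevance, cost)):
        if i in acquired:
            continue
        score = r / c  # predicted usefulness per unit cost
        if score > best_score:
            best, best_score = i, score
    return best
```

Repeating this selection, re-estimating relevance after each acquisition, yields an adaptive per-sample acquisition path that stops once the expected gain no longer justifies the cost.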
