Ahmed Aly

Retrieve-and-Fill for Scenario-based Task-Oriented Semantic Parsing

Feb 02, 2022
Akshat Shrivastava, Shrey Desai, Anchit Gupta, Ali Elkahky, Aleksandr Livshits, Alexander Zotov, Ahmed Aly

Task-oriented semantic parsing models have achieved strong results in recent years, but unfortunately do not strike an appealing balance between model size, runtime latency, and cross-domain generalizability. We tackle this problem by introducing scenario-based semantic parsing: a variant of the original task which first requires disambiguating an utterance's "scenario" (an intent-slot template with variable leaf spans) before generating its frame, complete with ontology and utterance tokens. This formulation enables us to isolate coarse-grained and fine-grained aspects of the task, each of which we solve with off-the-shelf neural modules, while also optimizing for the axes outlined above. Concretely, we create a Retrieve-and-Fill (RAF) architecture comprised of (1) a retrieval module which ranks the best scenario given an utterance and (2) a filling module which imputes spans into the scenario to create the frame. Our model is modular, differentiable, interpretable, and allows us to garner extra supervision from scenarios. RAF achieves strong results in high-resource, low-resource, and multilingual settings, outperforming recent approaches by wide margins despite using base pre-trained encoders, small sequence lengths, and parallel decoding.
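
The two-stage flow can be sketched as follows. This is a minimal, hypothetical illustration: the toy scenarios, the lexical-overlap scorer, and the keyword-based span extractors are stand-ins for the paper's neural retrieval and filling modules.

```python
import re

def overlap_score(utterance, scenario):
    """Toy retrieval scorer: lexical overlap between utterance and template."""
    u = set(utterance.lower().split())
    t = set(re.split(r"[^a-z]+", scenario["template"].lower())) - {""}
    return len(u & t)

def retrieve(utterance, scenarios):
    """Stage 1: rank intent-slot templates ('scenarios') for the utterance."""
    return max(scenarios, key=lambda s: overlap_score(utterance, s))

def fill(utterance, scenario):
    """Stage 2: impute utterance spans into the scenario's variable leaves."""
    frame = scenario["template"]
    for placeholder, extract in scenario["slots"].items():
        frame = frame.replace(placeholder, extract(utterance))
    return frame

scenarios = [
    {"template": "[IN:CREATE_ALARM [SL:DATE_TIME SPAN0 ] ]",
     "slots": {"SPAN0": lambda u: u.split("alarm for ")[-1]}},
    {"template": "[IN:GET_WEATHER [SL:LOCATION SPAN0 ] ]",
     "slots": {"SPAN0": lambda u: u.split("weather in ")[-1]}},
]

utterance = "set an alarm for 6 pm"
frame = fill(utterance, retrieve(utterance, scenarios))
```

Because retrieval and filling are separate modules, each can be swapped or supervised independently, which is what makes the formulation modular.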

AutoNLU: Detecting, root-causing, and fixing NLU model errors

Oct 12, 2021
Pooja Sethi, Denis Savenkov, Forough Arabshahi, Jack Goetz, Micaela Tolliver, Nicolas Scheffer, Ilknur Kabul, Yue Liu, Ahmed Aly

Improving the quality of Natural Language Understanding (NLU) models, and more specifically, task-oriented semantic parsing models, in production is a cumbersome task. In this work, we present a system called AutoNLU, which we designed to scale the NLU quality improvement process. It adds automation to three key steps: detection, attribution, and correction of model errors, i.e., bugs. We detected four times more failed tasks than with random sampling, finding that even a simple active learning sampling method on an uncalibrated model is surprisingly effective for this purpose. The AutoNLU tool empowered linguists to fix ten times more semantic parsing bugs than with prior manual processes, auto-correcting 65% of all identified bugs.
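
The simple sampling strategy the abstract credits can be sketched as below: rank unlabeled utterances by the (uncalibrated) model's own confidence and surface the least-confident ones for linguist review. The toy confidence scores are assumptions for illustration, not AutoNLU's actual scoring.

```python
def least_confident(utterances, confidence, k):
    """Return the k utterances the model is least sure about."""
    return sorted(utterances, key=confidence)[:k]

# Toy stand-in for a parser's top-hypothesis probability per utterance.
scores = {"play some jazz": 0.97, "wake me at six": 0.91,
          "remind me about the thing": 0.42, "do the usual": 0.35}

# Least-confident utterances are surfaced first for error triage.
to_review = least_confident(scores, scores.get, k=2)
```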

* 8 pages, 5 figures 

Assessing Data Efficiency in Task-Oriented Semantic Parsing

Jul 10, 2021
Shrey Desai, Akshat Shrivastava, Justin Rill, Brian Moran, Safiyyah Saleem, Alexander Zotov, Ahmed Aly

Data efficiency, despite being an attractive characteristic, is often challenging to measure and optimize for in task-oriented semantic parsing; unlike exact match, it can require both model- and domain-specific setups, which have, historically, varied widely across experiments. In our work, as a step towards providing a unified solution to data-efficiency-related questions, we introduce a four-stage protocol which gives an approximate measure of how much in-domain, "target" data a parser requires to achieve a certain quality bar. Specifically, our protocol consists of (1) sampling target subsets of different cardinalities, (2) fine-tuning parsers on each subset, (3) obtaining a smooth curve relating target subset (%) vs. exact match (%), and (4) referencing the curve to mine ad-hoc (target subset, exact match) points. We apply our protocol in two real-world case studies -- model generalizability and intent complexity -- illustrating its flexibility and applicability to practitioners in task-oriented semantic parsing.
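
The four stages can be sketched end to end. The `evaluate` function below is a synthetic stand-in for fine-tuning and scoring a parser (a toy saturating curve), so the numbers are illustrative only; the protocol structure is what matters.

```python
import random

def run_protocol(target_data, fractions, evaluate):
    """Stages 1-3: sample target subsets, fine-tune on each, build the curve."""
    curve = []
    for frac in sorted(fractions):
        subset = random.sample(target_data, int(frac * len(target_data)))
        curve.append((frac, evaluate(subset)))   # (target subset %, EM %)
    return curve

def data_needed(curve, target_em):
    """Stage 4: mine the curve for the smallest subset reaching target_em."""
    for frac, em in curve:
        if em >= target_em:
            return frac
    return None

# Toy stand-in for fine-tuning + evaluation: EM saturates with data size.
evaluate = lambda subset: round(90 * (len(subset) / 1000) ** 0.3)

curve = run_protocol(list(range(1000)), [0.01, 0.1, 0.25, 1.0], evaluate)
frac = data_needed(curve, target_em=50)   # smallest fraction hitting 50 EM
```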

Latency-Aware Neural Architecture Search with Multi-Objective Bayesian Optimization

Jun 25, 2021
David Eriksson, Pierce I-Jen Chuang, Samuel Daulton, Peng Xia, Akshat Shrivastava, Arun Babu, Shicong Zhao, Ahmed Aly, Ganesh Venkatesh, Maximilian Balandat

When tuning the architecture and hyperparameters of large machine learning models for on-device deployment, it is desirable to understand the optimal trade-offs between on-device latency and model accuracy. In this work, we leverage recent methodological advances in Bayesian optimization over high-dimensional search spaces and multi-objective Bayesian optimization to efficiently explore these trade-offs for a production-scale on-device natural language understanding model at Facebook.
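
The object being explored is a Pareto frontier over (latency, accuracy). As an illustrative sketch, given a set of already-evaluated architectures, the frontier keeps every configuration not strictly dominated on both objectives; the candidate points are made up, and the paper itself uses multi-objective Bayesian optimization rather than exhaustive evaluation.

```python
def pareto_front(points):
    """Keep (latency, accuracy) points not dominated by any other point."""
    front = []
    for lat, acc in points:
        dominated = any(l <= lat and a >= acc and (l, a) != (lat, acc)
                        for l, a in points)
        if not dominated:
            front.append((lat, acc))
    return sorted(front)

# Hypothetical (on-device latency ms, accuracy) pairs for candidate models.
candidates = [(12, 0.81), (15, 0.84), (15, 0.80), (20, 0.84), (25, 0.86)]
front = pareto_front(candidates)
```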

* To Appear at the 8th ICML Workshop on Automated Machine Learning, ICML 2021 

Diagnosing Transformers in Task-Oriented Semantic Parsing

May 27, 2021
Shrey Desai, Ahmed Aly

Modern task-oriented semantic parsing approaches typically use seq2seq transformers to map textual utterances to semantic frames comprised of intents and slots. While these models are empirically strong, their specific strengths and weaknesses have largely remained unexplored. In this work, we study BART and XLM-R, two state-of-the-art parsers, across both monolingual and multilingual settings. Our experiments yield several key results: transformer-based parsers struggle not only with disambiguating intents/slots, but surprisingly also with producing syntactically-valid frames. Though pre-training imbues transformers with syntactic inductive biases, we find the ambiguity of copying utterance spans into frames often leads to tree invalidity, indicating span extraction is a major bottleneck for current parsers. However, as a silver lining, we show transformer-based parsers give sufficient indicators for whether a frame is likely to be correct or incorrect, making them easier to deploy in production settings.
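
The syntactic-validity failure mode can be checked mechanically: a generated frame is valid only if its brackets nest properly. A minimal sketch, with TOP-style frame strings that are illustrative rather than model output:

```python
def is_valid_frame(frame):
    """True iff every '[' opening a labeled node is later closed."""
    depth = 0
    for tok in frame.split():
        if tok.startswith("["):
            depth += 1
        elif tok == "]":
            depth -= 1
            if depth < 0:       # a close with no matching open
                return False
    return depth == 0           # every open was closed

ok = is_valid_frame("[IN:GET_WEATHER [SL:LOCATION madrid ] ]")
bad = is_valid_frame("[IN:GET_WEATHER [SL:LOCATION madrid ]")
```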

* Accepted to Findings of ACL 2021 

Span Pointer Networks for Non-Autoregressive Task-Oriented Semantic Parsing

Apr 16, 2021
Akshat Shrivastava, Pierce Chuang, Arun Babu, Shrey Desai, Abhinav Arora, Alexander Zotov, Ahmed Aly

An effective recipe for building seq2seq, non-autoregressive, task-oriented parsers to map utterances to semantic frames proceeds in three steps: encoding an utterance $x$, predicting a frame's length |y|, and decoding a |y|-sized frame with utterance and ontology tokens. Though empirically strong, these models are typically bottlenecked by length prediction, as even small inaccuracies change the syntactic and semantic characteristics of resulting frames. In our work, we propose span pointer networks, non-autoregressive parsers which shift the decoding task from text generation to span prediction; that is, when imputing utterance spans into frame slots, our model produces endpoints (e.g., [i, j]) as opposed to text (e.g., "6pm"). This natural quantization of the output space reduces the variability of gold frames, therefore improving length prediction and, ultimately, exact match. Furthermore, length prediction is now responsible for frame syntax and the decoder is responsible for frame semantics, resulting in a coarse-to-fine model. We evaluate our approach on several task-oriented semantic parsing datasets. Notably, we bridge the quality gap between non-autoregressive and autoregressive parsers, achieving 87 EM on TOPv2 (Chen et al. 2020). Furthermore, due to our more consistent gold frames, we show strong improvements in model generalization in both cross-domain and cross-lingual transfer in low-resource settings. Finally, due to our diminished output vocabulary, we observe a 70% reduction in latency and an 83% reduction in memory at beam size 5 compared to prior non-autoregressive parsers.
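
The span-pointer output space can be sketched as below: the decoder emits endpoint pairs into the utterance instead of copying text, and the frame is rendered by slicing. The frame skeleton and spans here are illustrative assumptions, not actual model output.

```python
def render_frame(utterance, skeleton, spans):
    """Replace each span placeholder <k> with utterance tokens [i, j)."""
    tokens = utterance.split()
    out = []
    for tok in skeleton.split():
        if tok.startswith("<") and tok.endswith(">"):
            i, j = spans[int(tok[1:-1])]
            out.extend(tokens[i:j])     # copy by index, not by generation
        else:
            out.append(tok)
    return " ".join(out)

utterance = "set an alarm for 6 pm tomorrow"
skeleton = "[IN:CREATE_ALARM [SL:DATE_TIME <0> ] ]"
spans = {0: (4, 7)}   # endpoints [i, j) in place of generated text
frame = render_frame(utterance, skeleton, spans)
```

Since every leaf is a pair of indices rather than free text, gold frames for paraphrases of the same request differ less, which is the source of the improved length prediction.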

Low-Resource Task-Oriented Semantic Parsing via Intrinsic Modeling

Apr 15, 2021
Shrey Desai, Akshat Shrivastava, Alexander Zotov, Ahmed Aly

Task-oriented semantic parsing models typically have high resource requirements: to support new ontologies (i.e., intents and slots), practitioners crowdsource thousands of samples for supervised fine-tuning. Partly, this is due to the structure of de facto copy-generate parsers; these models treat ontology labels as discrete entities, relying on parallel data to extrinsically derive their meaning. In our work, we instead exploit what we intrinsically know about ontology labels; for example, the fact that SL:TIME_ZONE has the categorical type "slot" and language-based span "time zone". Using this motivation, we build our approach with offline and online stages. During preprocessing, for each ontology label, we extract its intrinsic properties into a component, and insert each component into an inventory as a cache of sorts. During training, we fine-tune a seq2seq, pre-trained transformer to map utterances and inventories to frames, parse trees comprised of utterance and ontology tokens. Our formulation encourages the model to consider each ontology label as a union of its intrinsic properties, therefore substantially bootstrapping learning in low-resource settings. Experiments show our model is highly sample efficient: using a low-resource benchmark derived from TOPv2, our inventory parser outperforms a copy-generate parser by +15 EM absolute (44% relative) when fine-tuning on 10 samples from an unseen domain.
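
The offline inventory-building step can be sketched directly from the SL:TIME_ZONE example above: each label is decomposed into its intrinsic properties (a categorical type plus a language-based span). The decomposition rule below is inferred from that example and is a simplification of the paper's preprocessing.

```python
def to_component(label):
    """Split e.g. 'SL:TIME_ZONE' into its type and natural-language span."""
    prefix, name = label.split(":")
    kind = {"IN": "intent", "SL": "slot"}[prefix]
    span = name.replace("_", " ").lower()
    return {"label": label, "type": kind, "span": span}

# The inventory acts as a cache of components, one per ontology label.
inventory = [to_component(l) for l in
             ["IN:CREATE_ALARM", "SL:TIME_ZONE", "SL:DATE_TIME"]]
```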

Non-Autoregressive Semantic Parsing for Compositional Task-Oriented Dialog

Apr 11, 2021
Arun Babu, Akshat Shrivastava, Armen Aghajanyan, Ahmed Aly, Angela Fan, Marjan Ghazvininejad

Semantic parsing using sequence-to-sequence models allows the parsing of deeper representations than traditional word-tagging-based models. In spite of these advantages, widespread adoption of these models for real-time conversational use cases has been stymied by higher compute requirements and thus higher latency. In this work, we propose a non-autoregressive approach to predict semantic parse trees with an efficient seq2seq model architecture. By combining non-autoregressive prediction with convolutional neural networks, we achieve significant latency gains and parameter size reduction compared to traditional RNN models. Our novel architecture achieves up to an 81% reduction in latency on the TOP dataset and remains competitive with non-pretrained models on three different semantic parsing datasets. Our code is available at https://github.com/facebookresearch/pytext
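
The source of the latency gain can be sketched schematically: an autoregressive decoder needs |y| sequential steps because each token conditions on the prefix, while a non-autoregressive decoder fills every position independently in one parallel step. The toy classifiers below are purely illustrative stand-ins for the real decoder heads.

```python
def decode_parallel(length, classify):
    """Non-autoregressive: all positions predicted independently, one step."""
    return [classify(pos) for pos in range(length)]

def decode_autoregressive(length, classify_next):
    """Autoregressive baseline: each token conditions on the decoded prefix."""
    out = []
    for _ in range(length):
        out.append(classify_next(out))
    return out

# Toy target frame; both regimes recover it, but with different step counts.
target = ["[IN:CREATE_ALARM", "[SL:DATE_TIME", "6", "pm", "]", "]"]
parallel = decode_parallel(len(target), lambda pos: target[pos])
sequential = decode_autoregressive(len(target), lambda prefix: target[len(prefix)])
```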
