Hoang Long Nguyen

Referring to Screen Texts with Voice Assistants

Jun 10, 2023
Shruti Bhargava, Anand Dhoot, Ing-Marie Jonsson, Hoang Long Nguyen, Alkesh Patel, Hong Yu, Vincent Renkens

Voice assistants help users make phone calls, send messages, create events, navigate, and much more. However, assistants have a limited ability to understand their users' context. In this work, we take a step toward closing that gap: we explore a new experience that lets users refer to phone numbers, addresses, email addresses, URLs, and dates shown on their phone screens. Our focus is reference understanding, which becomes particularly interesting when multiple similar texts are present on screen, a setting analogous to visual grounding. We collect a dataset and propose a lightweight, general-purpose model for this novel experience. Because consuming pixels directly is expensive, our system relies on text extracted from the UI. The model is modular, offering flexibility, improved interpretability, and efficient runtime memory utilization.

* 7 pages, Accepted to ACL Industry Track 2023 
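
The listing above does not include code; as a loose illustration of the text-only setting the abstract describes, the sketch below extracts candidate entities (phone numbers, emails, URLs) from UI-extracted text with regular expressions and resolves a spoken reference with a crude keyword-overlap score. Every name, pattern, and the scoring heuristic is a hypothetical stand-in for the paper's learned, modular model.

```python
import re

# Hypothetical extractors standing in for the assistant's UI-text parsing step.
CANDIDATE_PATTERNS = {
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "url": re.compile(r"https?://\S+"),
}

def extract_candidates(screen_text):
    """Pull typed entities (phone/email/URL) out of text extracted from the UI."""
    candidates = []
    for etype, pattern in CANDIDATE_PATTERNS.items():
        for match in pattern.finditer(screen_text):
            candidates.append({"type": etype, "value": match.group(), "span": match.span()})
    return candidates

def resolve_reference(query, screen_text):
    """Score candidates by word overlap between the query and the candidate's
    surrounding context -- a crude stand-in for the paper's learned model."""
    query_tokens = set(query.lower().split())
    best, best_score = None, -1.0
    for cand in extract_candidates(screen_text):
        start, end = cand["span"]
        context = screen_text[max(0, start - 40):end + 40].lower()
        score = len(query_tokens & set(context.split()))
        if score > best_score:
            best, best_score = cand, score
    return best

# Invented screen content and query for demonstration only.
screen = "Dr. Smith's office: call (555) 123-4567 or email office@smithclinic.com"
print(resolve_reference("call the office number", screen))
```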

User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems

Aug 02, 2021
Hoang Long Nguyen, Vincent Renkens, Joris Pelemans, Srividya Pranavi Potharaju, Anil Kumar Nalamalapu, Murat Akbacak

Recognition errors are common in human communication. Similar errors often lead to unwanted behaviour in dialogue systems or virtual assistants. In human communication, we can recover from them by repeating misrecognized words or phrases; in human-machine communication, however, this recovery mechanism is not available. In this paper, we attempt to bridge this gap and present a system that allows a user to correct speech recognition errors in a virtual assistant by repeating misunderstood words. When a user repeats part of the phrase, the system rewrites the original query to incorporate the correction. This rewrite allows the virtual assistant to understand the original query successfully. We present an end-to-end 2-step attention pointer network that generates the rewritten query by merging the incorrectly understood utterance with the correction follow-up. We evaluate the model on data collected for this task and compare the proposed model to a rule-based baseline and a standard pointer network. We show that rewriting the original query is an effective way to handle repetition-based recovery and that the proposed model outperforms the rule-based baseline, reducing Word Error Rate by 19% relative at a 2% False Alarm Rate on annotated data.

* Will be published in Interspeech 2021 
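
There is no code in this listing; the snippet below is a minimal rule-based sketch of the query-rewriting idea, closer in spirit to the rule-based baseline the paper compares against than to its 2-step attention pointer network. The fuzzy-matching heuristic, similarity threshold, and example utterances are all invented.

```python
from difflib import SequenceMatcher

def token_similarity(a, b):
    """Character-level fuzzy similarity between two tokens (0..1)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def rewrite_query(original, correction, threshold=0.6):
    """Rule-based rewrite: find the contiguous span of the original query that the
    correction best re-states (every aligned token above `threshold`), then splice
    the corrected words in. The paper's pointer network learns this merge end to end."""
    orig, corr = original.split(), correction.split()
    best = None  # ((span_len, avg_sim), orig_start, orig_end, corr_start, corr_end)
    for cs in range(len(corr)):
        for ce in range(cs + 1, len(corr) + 1):
            span = corr[cs:ce]
            for start in range(len(orig) - len(span) + 1):
                sims = [token_similarity(a, b) for a, b in zip(orig[start:start + len(span)], span)]
                if min(sims) < threshold:
                    continue  # reject spans containing carrier words like "no, I said"
                score = (len(span), sum(sims) / len(sims))
                if best is None or score > best[0]:
                    best = (score, start, start + len(span), cs, ce)
    if best is None:
        return original  # nothing in the correction matches; leave the query untouched
    _, o1, o2, c1, c2 = best
    return " ".join(orig[:o1] + corr[c1:c2] + orig[o2:])

# Invented example: the assistant heard "Lynch" where the user said "Lynn".
print(rewrite_query("call Lynch Smith on mobile", "no I said Lynn Smith"))
```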

Google Trends Analysis of COVID-19

Nov 07, 2020
Hoang Long Nguyen, Zhenhe Pan, Hashim Abu-gellban, Fang Jin, Yuanlin Zhang

The World Health Organization (WHO) declared COVID-19 a pandemic on 11 March 2020, when roughly 118K cases had been reported across several countries and territories. Numerous researchers have worked on forecasting the number of confirmed cases, since anticipating case growth helps governments make difficult decisions about easing lockdown orders; easing such orders helps people who have lost their jobs and supports gravely impacted businesses. Our research investigates the relationship between Google search trends and the spread of the novel coronavirus (COVID-19) across countries worldwide, with the goal of predicting the number of cases. We perform a correlation analysis between keywords from related Google search trends and the number of confirmed cases reported by the WHO. We then apply several machine learning techniques (Multiple Linear Regression, Non-negative Integer Regression, and a Deep Neural Network) to forecast the number of confirmed cases globally, using historical data as well as hybrid data that incorporates the Google search trends. Our results show that Google search trends are highly associated with the number of reported confirmed cases and that the deep learning approach outperforms the other forecasting techniques. We believe this is a promising approach not only for forecasting confirmed COVID-19 cases but also for similar forecasting problems associated with related Google trends.
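
As a rough sketch of the described methodology (correlation analysis followed by regression-based forecasting), the snippet below runs both steps on synthetic placeholder data; real inputs would be Google Trends exports and WHO case counts, and the keyword labels, noise model, and train/test split here are made up.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic placeholder data: weekly Google Trends scores for three keywords
# and WHO-style confirmed-case counts with an assumed linear relationship.
weeks = 30
trends = rng.uniform(0, 100, size=(weeks, 3))  # e.g. "coronavirus", "covid symptoms", "lockdown"
cases = 50 * trends[:, 0] + 20 * trends[:, 1] + rng.normal(0, 200, weeks)

# Step 1: correlation analysis -- how strongly does each keyword track cases?
for i in range(trends.shape[1]):
    r = np.corrcoef(trends[:, i], cases)[0, 1]
    print(f"keyword {i}: Pearson r = {r:.2f}")

# Step 2: multiple linear regression forecast, one of the techniques the paper lists.
train, test = slice(0, 24), slice(24, weeks)
model = LinearRegression().fit(trends[train], cases[train])
pred = model.predict(trends[test])
mae = np.mean(np.abs(pred - cases[test]))
print(f"held-out MAE on synthetic data: {mae:.1f}")
```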

Alquist: The Alexa Prize Socialbot

Apr 18, 2018
Jan Pichl, Petr Marek, Jakub Konrád, Martin Matulík, Hoang Long Nguyen, Jan Šedivý

This paper describes Alquist, a new open-domain dialogue system developed as part of the Alexa Prize competition for the Amazon Echo line of products. The Alquist dialogue system is designed to conduct a coherent and engaging conversation on popular topics. We present a hybrid system combining several machine learning and rule-based approaches. We describe the Alquist pipeline, data acquisition and processing, the dialogue manager, NLG, knowledge aggregation, and the hierarchy of sub-dialogues. We also present some of the experimental results.
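
As a loose illustration of the hybrid design the abstract mentions, the sketch below routes utterances through keyword-triggered rule-based sub-dialogue handlers and falls back to a stubbed machine-learned topic classifier. All handlers, topics, and responses are invented and far simpler than Alquist's actual pipeline.

```python
import random

def movies_handler(utterance):
    return "I love films! What did you watch recently?"

def sports_handler(utterance):
    return "Sports are exciting. Do you follow any team?"

# Rule-based routing table: keyword -> sub-dialogue handler (invented examples).
RULES = {
    "movie": movies_handler,
    "film": movies_handler,
    "game": sports_handler,
    "team": sports_handler,
}

def ml_topic_classifier(utterance):
    """Stand-in for a learned topic/intent classifier used as a fallback."""
    return random.choice(["music", "books", "travel"])

def respond(utterance):
    lowered = utterance.lower()
    # Rule-based components handle structured sub-dialogues first.
    for keyword, handler in RULES.items():
        if keyword in lowered:
            return handler(utterance)
    # Otherwise fall back to the ML classifier and a generic topic prompt.
    topic = ml_topic_classifier(utterance)
    return f"Let's talk about {topic}. What do you enjoy about it?"

print(respond("I saw a great movie yesterday"))
print(respond("Tell me something interesting"))
```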
