The location fingerprinting method, which typically utilizes supervised learning, has been widely adopted as a viable solution for the indoor positioning problem. Many indoor positioning datasets are imbalanced. Models trained on imbalanced datasets may exhibit poor performance on the minority class(es). This problem, also known as the "curse of imbalanced data," becomes more evident when class distributions are highly imbalanced. Motivated by recent advances in deep generative modeling, this paper proposes using Variational Autoencoders and Conditional Variational Autoencoders as oversampling tools to produce class-balanced fingerprints. Experimental results based on Bluetooth Low Energy fingerprints demonstrate that the proposed method outperforms SMOTE and ADASYN in both minority-class precision and overall precision. To promote reproducibility and foster new research efforts, we have made all the code associated with this work publicly available.
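The class-balancing step can be sketched as below. This is a minimal illustration, not the paper's implementation: `balance_with_generator` and `make_toy_sampler` are hypothetical names, and the toy sampler merely jitters the per-class mean where a trained CVAE decoder conditioned on the class label would be plugged in.

```python
import numpy as np

def balance_with_generator(X, y, sample_fn):
    """Oversample every minority class up to the majority-class count.

    sample_fn(cls, n) is a stand-in for a trained CVAE decoder: given
    a class label and a count, it returns n synthetic fingerprints.
    """
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for cls, cnt in zip(classes, counts):
        need = target - cnt
        if need > 0:
            X_parts.append(sample_fn(cls, need))
            y_parts.append(np.full(need, cls))
    return np.vstack(X_parts), np.concatenate(y_parts)

# Toy stand-in generator: samples around the per-class mean. A CVAE
# decoder trained on the fingerprints would replace this function.
def make_toy_sampler(X, y, scale=0.01, seed=0):
    rng = np.random.default_rng(seed)
    def sample(cls, n):
        mu = X[y == cls].mean(axis=0)
        return mu + scale * rng.standard_normal((n, X.shape[1]))
    return sample

# Imbalanced toy RSSI fingerprints: four samples of class 0, one of class 1.
X = np.array([[-50.0, -60.0], [-51.0, -59.0], [-49.0, -61.0],
              [-50.5, -60.5], [-80.0, -40.0]])
y = np.array([0, 0, 0, 0, 1])
X_bal, y_bal = balance_with_generator(X, y, make_toy_sampler(X, y))
```

After the call, every class has as many samples as the majority class, so a downstream classifier sees a balanced training set.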
This paper introduces RyanSpeech, a new speech corpus for research on automated text-to-speech (TTS) systems. Publicly available TTS corpora are often noisy, recorded with multiple speakers, or lack quality male speech data. To meet the need for a high-quality, publicly available male speech corpus in the field of speech synthesis, we designed and created RyanSpeech, which contains textual materials from real-world conversational settings. These materials comprise over 10 hours of a professional male voice actor's speech recorded at 44.1 kHz. The corpus's design and pipeline make RyanSpeech ideal for developing TTS systems for real-world applications. To provide a baseline for future research, protocols, and benchmarks, we trained four state-of-the-art speech models and a vocoder on RyanSpeech. Our best model achieves a mean opinion score (MOS) of 3.36. We have made both the corpus and the trained models publicly available.
Large-scale transformer-based language models (LMs) demonstrate impressive capabilities in open text generation. However, controlling properties of the generated text, such as topic, style, and sentiment, is challenging and often requires significant changes to the model architecture or retraining and fine-tuning on new supervised data. This paper presents a novel approach to Topical Language Generation (TLG) that combines a pre-trained LM with topic-modeling information. We cast the problem in a Bayesian formulation, with topic probabilities as the prior, LM probabilities as the likelihood, and the topical language generation probability as the posterior. In learning the model, we derive the topic probability distribution from the natural structure of the user-provided document. Furthermore, we extend our model with new parameters and functions that influence how strongly topical features appear in the generated text, allowing us to easily control the topical properties of the output. Our experimental results demonstrate that our model outperforms the state of the art on coherence, diversity, and fluency while being faster at decoding.
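A minimal sketch of the Bayesian combination over a toy vocabulary, assuming per-token log-probabilities are available from both the LM and the topic model. The `gamma` knob is illustrative only, analogous to the paper's parameters for controlling topical strength, not the authors' exact formulation.

```python
import numpy as np

def topical_logprobs(lm_logprobs, topic_logprobs, gamma=1.0):
    """Posterior next-token distribution: prior (topic) x likelihood (LM).

    In log space: log p(w | topic) = log p_LM(w) + gamma * log p_topic(w) - log Z.
    gamma = 0 recovers the plain LM; larger gamma makes the output more topical.
    """
    post = lm_logprobs + gamma * topic_logprobs
    return post - np.logaddexp.reduce(post)  # renormalize over the vocabulary

# Toy 4-word vocabulary: the LM prefers word 0, the topic prior prefers word 2.
lm = np.log(np.array([0.4, 0.3, 0.2, 0.1]))
topic = np.log(np.array([0.1, 0.1, 0.7, 0.1]))
post = np.exp(topical_logprobs(lm, topic, gamma=1.0))
```

With `gamma=1.0` the topical word 2 becomes the most likely next token, while `gamma=0.0` falls back to the LM's own preference.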
This paper introduces BReG-NeXt, a residual-based network architecture that uses a function with a bounded derivative instead of a simple shortcut path (a.k.a. identity mapping) in the residual units, for automatic recognition of facial expressions based on the categorical and dimensional models of affect. Compared to ResNet, our proposed adaptive complex mapping results in a shallower network with fewer training parameters and floating-point operations (FLOPs). Adding trainable parameters to the bypass function further improves fitting and training of the network, and hence recognition of subtle facial expressions such as contempt with higher accuracy. We conducted comprehensive experiments on the categorical and dimensional models of affect on the challenging in-the-wild databases AffectNet, FER2013, and Affect-in-Wild. Our experimental results show that our adaptive complex mapping approach outperforms both the original ResNet, which uses a simple identity mapping, and other state-of-the-art methods for Facial Expression Recognition (FER). Various metrics are reported for both affect models to provide a comprehensive evaluation of our method. In the categorical model, BReG-NeXt-50, with only 3.1M training parameters and 15 MFLOPs, achieves 68.50% and 71.53% accuracy on the AffectNet and FER2013 databases, respectively. In the dimensional model, BReG-NeXt achieves RMSE values of 0.2577 and 0.2882 on the AffectNet and Affect-in-Wild databases, respectively.
Understanding emotions and responding accordingly is one of the biggest challenges for dialog systems. This paper presents EmpTransfo, a multi-head Transformer architecture for creating an empathetic dialog system. EmpTransfo utilizes state-of-the-art pre-trained models (e.g., OpenAI GPT) for language generation, though models of different sizes can be used. We show that utilizing the history of emotions and other metadata can improve the quality of conversations generated by the dialog system. Our experimental results on a challenging language corpus show that the proposed approach outperforms other models in terms of Hit@1 and perplexity (PPL).
Social robots are becoming an integrated part of our daily life due to their ability to provide companionship and entertainment. A subfield of robotics, Socially Assistive Robotics (SAR), is particularly suitable for extending these benefits into the healthcare setting because of its unique ability to provide cognitive, social, and emotional support. This paper presents our recent research on developing SAR by evaluating the ability of a life-like conversational social robot, called Ryan, to administer internet-delivered cognitive behavioral therapy (iCBT) to older adults with depression. For Ryan to administer the therapy, we developed a dialogue-management system called Program-R. Using an accredited CBT manual for the treatment of depression, we created seven hour-long iCBT dialogues and integrated them into Program-R using Artificial Intelligence Markup Language (AIML). To assess the effectiveness of robot-based iCBT and users' likability of our approach, we conducted an HRI study with a cohort of elderly people with mild-to-moderate depression over a period of four weeks. Quantitative analyses of participants' spoken responses (e.g., word count and sentiment analysis), face-scale mood scores, and exit surveys strongly support the notion that robot-based iCBT is a viable alternative to traditional human-delivered therapy.
Residual-based neural networks have shown remarkable results in various visual recognition tasks, including Facial Expression Recognition (FER). Despite the tremendous efforts made to improve the performance of FER systems using deep neural networks (DNNs), existing methods are not generalizable enough for practical applications. This paper introduces Bounded Residual Gradient Networks (BReG-Net) for facial expression recognition, in which the shortcut connection between the input and the output of the ResNet module is replaced with a differentiable function with a bounded gradient. This configuration prevents the network from facing the vanishing or exploding gradient problem. We show that utilizing such non-linear units results in shallower networks with better performance. Further, by using a weighted loss function that gives higher priority to under-represented categories, we achieve an overall better recognition rate. The results of our experiments show that BReG-Nets outperform state-of-the-art methods on three publicly available in-the-wild facial databases, on both the categorical and dimensional models of affect.
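The bounded-gradient shortcut idea can be illustrated with a tanh-based mapping; this is only one function with the stated property, and the exact mapping used in BReG-Net may differ. Its derivative lies in (0, 1], so, unlike an unbounded path, the shortcut cannot amplify gradients during backpropagation.

```python
import numpy as np

def bounded_shortcut(x, a=1.0):
    """Illustrative bounded-gradient shortcut: f(x) = tanh(a*x) / a.

    Near x = 0 it behaves like the identity mapping (f'(0) = 1), but its
    derivative f'(x) = 1 - tanh(a*x)**2 is bounded in (0, 1] everywhere.
    """
    return np.tanh(a * x) / a

def bounded_shortcut_grad(x, a=1.0):
    # d/dx [tanh(a*x)/a] = sech^2(a*x) = 1 - tanh(a*x)**2
    return 1.0 - np.tanh(a * x) ** 2
```

In a residual unit, this function would take the place of the identity term, i.e. the block output becomes `f(x) + F(x)` instead of `x + F(x)`.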
This paper presents the results of our recent work studying the effects of deep brain stimulation (DBS) and medication on the dynamics of brain local field potential (LFP) signals used for behavior analysis of patients with Parkinson's disease (PD). DBS is a technique used to alleviate the severe symptoms of PD when pharmacotherapy is not very effective. Behavior recognition from LFP signals recorded from the subthalamic nucleus (STN) has application in developing closed-loop DBS systems, in which the stimulation pulse is adaptively generated according to the subject's ongoing behavior. Most existing studies on behavior recognition that use STN-LFPs assume the DBS is off. This paper investigates how the performance and accuracy of automated behavior recognition from LFP signals are affected under different stimulation on/off paradigms. We first study beta-power suppression in LFP signals under different scenarios (stimulation on/off and medication on/off). Afterward, we explore the accuracy of support vector machines in predicting human actions (button press and reach) using the spectrogram of STN-LFP signals. Our experiments on the recorded LFP signals of three subjects confirm that beta power is suppressed significantly when the patients are on medication (p < 0.002) or stimulation (p < 0.0003). The results also show that we can classify the different behaviors with a reasonable accuracy of 85%, even when high-amplitude stimulation is applied.
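The beta-power quantity at the center of this analysis can be sketched with a plain FFT band-power estimate. This is a generic sketch on synthetic signals, not the authors' spectral-estimation pipeline, and `band_power` is a hypothetical helper name.

```python
import numpy as np

def band_power(x, fs, lo=13.0, hi=30.0):
    """Mean FFT power of signal x (sampled at fs Hz) in the [lo, hi] Hz band.

    Defaults to the beta band (13-30 Hz), whose suppression under
    medication and stimulation is reported in the study.
    """
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(x)) ** 2 / len(x)
    band = (freqs >= lo) & (freqs <= hi)
    return psd[band].mean()

# Synthetic 1-second segments at 1 kHz: one dominated by a 20 Hz (beta)
# oscillation, one by a 50 Hz oscillation outside the beta band.
fs = 1000
t = np.arange(fs) / fs
beta_seg = np.sin(2 * np.pi * 20 * t)
gamma_seg = np.sin(2 * np.pi * 50 * t)
```

Comparing `band_power` between on/off conditions is the kind of per-segment feature a significance test or an SVM spectrogram classifier can then be built on.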
This paper presents the design, development, methodology, and results of a pilot study on using an intelligent, emotive, and perceptive social robot (a.k.a. Companionbot) to improve the quality of life of elderly people with dementia and/or depression. Ryan Companionbot, prototyped in this project, is a rear-projected, life-like conversational robot. Ryan is equipped with features that can (1) interpret and respond to users' emotions through facial expressions and spoken language, (2) proactively engage in conversations with users, and (3) remind them about their daily schedules (e.g., taking their medicine on time). Ryan also engages users in cognitive games and reminiscence activities. We conducted a pilot study with six elderly individuals with moderate dementia and/or depression living in a senior living facility in Denver. Each individual had 24/7 access to Ryan in his/her room for a period of 4-6 weeks. Our observations of these individuals, interviews with them and their caregivers, and analyses of their interactions during this period revealed that they established rapport with the robot and greatly valued and enjoyed having a Companionbot in their room.
Automated affective computing in the wild is a challenging problem in computer vision. Existing annotated databases of facial expressions in the wild are small and mostly cover discrete emotions (a.k.a. the categorical model). Annotated facial databases for affective computing in the continuous dimensional model (e.g., valence and arousal) are very limited. To meet this need, we collected, annotated, and prepared for public distribution a new database of facial emotions in the wild, called AffectNet. AffectNet contains more than 1,000,000 facial images collected from the Internet by querying three major search engines with 1,250 emotion-related keywords in six different languages. About half of the retrieved images were manually annotated for the presence of seven discrete facial expressions and the intensity of valence and arousal. AffectNet is by far the largest database of facial expression, valence, and arousal in the wild, enabling research on automated facial expression recognition in two different emotion models. Two baseline deep neural networks are used to classify images in the categorical model and predict the intensity of valence and arousal. Various evaluation metrics show that our deep neural network baselines outperform conventional machine learning methods and off-the-shelf facial expression recognition systems.