Huamin Qu

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models

Sep 07, 2021

Zhihua Jin, Xin Jiang, Xingbo Wang, Qun Liu, Yong Wang, Xiaozhe Ren, Huamin Qu

Abstract:Existing generative pre-trained language models (e.g., GPT) focus on modeling the language structure and semantics of general texts. However, those models do not consider the numerical properties of numbers and cannot perform robustly on numerical reasoning tasks (e.g., math word problems and measurement estimation). In this paper, we propose NumGPT, a generative pre-trained model that explicitly models the numerical properties of numbers in texts. Specifically, it leverages a prototype-based numeral embedding to encode the mantissa of the number and an individual embedding to encode the exponent of the number. A numeral-aware loss function is designed to integrate numerals into the pre-training objective of NumGPT. We conduct extensive experiments on four different datasets to evaluate the numeracy ability of NumGPT. The experiment results show that NumGPT outperforms baseline models (e.g., GPT and GPT with DICE) on a range of numerical reasoning tasks such as measurement estimation, number comparison, math word problems, and magnitude classification. Ablation studies are also conducted to evaluate the impact of pre-training and model hyperparameters on the performance.

* 8 pages, 3 figures
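
The mantissa/exponent split mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the evenly spaced prototypes, and the Gaussian-style weighting are all assumptions chosen for illustration, and the abstract does not specify how NumGPT computes its prototype weights.

```python
import math

def split_number(x):
    """Return (mantissa, exponent) with x ~= mantissa * 10**exponent
    and 1 <= |mantissa| < 10 (scientific notation). Zero maps to (0.0, 0)."""
    if x == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(x)))
    mantissa = x / 10 ** exponent
    # Guard against floating-point rounding in log10 near powers of ten.
    if abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    elif abs(mantissa) < 1:
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent

def prototype_weights(mantissa, prototypes, sigma=1.0):
    """Hypothetical soft assignment of a mantissa to prototype values.
    Weights are normalized Gaussian similarities (an assumed choice)."""
    scores = [math.exp(-((abs(mantissa) - p) ** 2) / sigma ** 2) for p in prototypes]
    total = sum(scores)
    return [s / total for s in scores]

# Assumed prototypes: the integers 1..9 covering the mantissa range [1, 10).
PROTOTYPES = [float(p) for p in range(1, 10)]

mantissa, exponent = split_number(1234.5)
weights = prototype_weights(mantissa, PROTOTYPES)
```

In a model, the weights would mix learned prototype vectors into a mantissa embedding, while the exponent would index a separate embedding table; both of those steps are omitted here.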

VBridge: Connecting the Dots Between Features, Explanations, and Data for Healthcare Models

Aug 04, 2021

Furui Cheng, Dongyu Liu, Fan Du, Yanna Lin, Alexandra Zytek, Haomin Li, Huamin Qu, Kalyan Veeramachaneni

Abstract:Machine learning (ML) is increasingly applied to Electronic Health Records (EHRs) to solve clinical prediction tasks. Although many ML models perform promisingly, issues with model transparency and interpretability limit their adoption in clinical practice. Directly using existing explainable ML techniques in clinical settings can be challenging. Through literature surveys and collaborations with six clinicians with an average of 17 years of clinical experience, we identified three key challenges, including clinicians' unfamiliarity with ML features, lack of contextual information, and the need for cohort-level evidence. Following an iterative design process, we further designed and developed VBridge, a visual analytics tool that seamlessly incorporates ML explanations into clinicians' decision-making workflow. The system includes a novel hierarchical display of contribution-based feature explanations and enriched interactions that connect the dots between ML features, explanations, and data. We demonstrated the effectiveness of VBridge through two case studies and expert interviews with four clinicians, showing that visually associating model explanations with patients' situational records can help clinicians better interpret and use model predictions when making clinical decisions. We further derived a list of design implications for developing future explainable ML tools to support clinical decision-making.

* Accepted to IEEE VIS 2021. To appear in IEEE Transactions on Visualization and Computer Graphics

M2Lens: Visualizing and Explaining Multimodal Models for Sentiment Analysis

Aug 01, 2021

Xingbo Wang, Jianben He, Zhihua Jin, Muqiao Yang, Yong Wang, Huamin Qu

Abstract:Multimodal sentiment analysis aims to recognize people's attitudes from multiple communication channels such as verbal content (i.e., text), voice, and facial expressions. It has become a vibrant and important research topic in natural language processing. Much research focuses on modeling the complex intra- and inter-modal interactions between different communication channels. However, current multimodal models with strong performance are often deep-learning-based techniques and work like black boxes. It is not clear how models utilize multimodal information for sentiment predictions. Despite recent advances in techniques for enhancing the explainability of machine learning models, they often target unimodal scenarios (e.g., images, sentences), and little research has been done on explaining multimodal models. In this paper, we present an interactive visual analytics system, M2Lens, to visualize and explain multimodal models for sentiment analysis. M2Lens provides explanations on intra- and inter-modal interactions at the global, subset, and local levels. Specifically, it summarizes the influence of three typical interaction types (i.e., dominance, complement, and conflict) on the model predictions. Moreover, M2Lens identifies frequent and influential multimodal features and supports the multi-faceted exploration of model behaviors from language, acoustic, and visual modalities. Through two case studies and expert interviews, we demonstrate our system can help users gain deep insights into the multimodal models for sentiment analysis.

* 11 pages, 7 figures. This paper is accepted by IEEE VIS, 2021. To appear in IEEE Transactions on Visualization and Computer Graphics (TVCG)

DeHumor: Visual Analytics for Decomposing Humor

Jul 18, 2021

Xingbo Wang, Yao Ming, Tongshuang Wu, Haipeng Zeng, Yong Wang, Huamin Qu

Abstract:Despite being a critical communication skill, grasping humor is challenging -- a successful use of humor requires a mixture of both engaging content build-up and an appropriate vocal delivery (e.g., pause). Prior studies on computational humor emphasize the textual and audio features immediately next to the punchline, yet overlook longer-term context setup. Moreover, the theories are usually too abstract for understanding each concrete humor snippet. To fill in the gap, we develop DeHumor, a visual analytical system for analyzing humorous behaviors in public speaking. To intuitively reveal the building blocks of each concrete example, DeHumor decomposes each humorous video into multimodal features and provides inline annotations of them on the video script. In particular, to better capture the build-ups, we introduce content repetition as a complement to features introduced in theories of computational humor and visualize them in a context linking graph. To help users locate the punchlines that have the desired features to learn, we summarize the content (with keywords) and humor feature statistics on an augmented time matrix. With case studies on stand-up comedy shows and TED talks, we show that DeHumor is able to highlight various building blocks of humor examples. In addition, expert interviews with communication coaches and humor researchers demonstrate the effectiveness of DeHumor for multimodal humor analysis of speech content and vocal delivery.

* 15 pages. A preprint version of a publication at IEEE Transactions on Visualization and Computer Graphics (TVCG), 2021

GNNVis: A Visual Analytics Approach for Prediction Error Diagnosis of Graph Neural Networks

Dec 03, 2020

Zhihua Jin, Yong Wang, Qianwen Wang, Yao Ming, Tengfei Ma, Huamin Qu

Abstract:Graph Neural Networks (GNNs) aim to extend deep learning techniques to graph data and have achieved significant progress in graph analysis tasks (e.g., node classification) in recent years. However, similar to other deep neural networks like Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), GNNs behave like a black box with their details hidden from model developers and users. It is therefore difficult to diagnose possible errors of GNNs. Despite many visual analytics studies being done on CNNs and RNNs, little research has addressed the challenges for GNNs. This paper fills the research gap with an interactive visual analysis tool, GNNVis, to assist model developers and users in understanding and analyzing GNNs. Specifically, Parallel Sets View and Projection View enable users to quickly identify and validate error patterns in the set of wrong predictions; Graph View and Feature Matrix View offer a detailed analysis of individual nodes to assist users in forming hypotheses about the error patterns. Since GNNs jointly model the graph structure and the node features, we reveal the relative influences of the two types of information by comparing the predictions of three models: GNN, Multi-Layer Perceptron (MLP), and GNN Without Using Features (GNNWUF). Two case studies and interviews with domain experts demonstrate the effectiveness of GNNVis in facilitating the understanding of GNN models and their errors.

* 14 pages
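
The three-model comparison described in the abstract (GNN, MLP, and GNNWUF) can be sketched as a tally of which models classify each node correctly. The function name and the toy prediction lists below are hypothetical, and no models are trained here; the sketch only assumes that per-node predictions from the three models are already available.

```python
from collections import Counter

def correctness_patterns(labels, gnn_pred, mlp_pred, gnnwuf_pred):
    """Count nodes by (gnn_correct, mlp_correct, gnnwuf_correct).

    The MLP sees only node features and GNNWUF sees only graph structure,
    so the pattern hints at which kind of information a prediction relied on.
    """
    patterns = Counter()
    for y, g, m, w in zip(labels, gnn_pred, mlp_pred, gnnwuf_pred):
        patterns[(g == y, m == y, w == y)] += 1
    return patterns

# Toy, made-up predictions for four nodes.
labels = [0, 1, 1, 0]
gnn_pred = [0, 1, 0, 0]
mlp_pred = [0, 0, 0, 0]
gnnwuf_pred = [0, 1, 1, 1]

patterns = correctness_patterns(labels, gnn_pred, mlp_pred, gnnwuf_pred)
for pattern, count in sorted(patterns.items(), reverse=True):
    print(pattern, count)
```

A node where only the structure-only model is correct, for instance, is a candidate for closer inspection of how its features affected the GNN's prediction.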

DECE: Decision Explorer with Counterfactual Explanations for Machine Learning Models

Aug 19, 2020

Furui Cheng, Yao Ming, Huamin Qu

Abstract:With machine learning models being increasingly applied to various decision-making scenarios, people have spent growing efforts to make machine learning models more transparent and explainable. Among various explanation techniques, counterfactual explanations have the advantages of being human-friendly and actionable -- a counterfactual explanation tells the user how to gain the desired prediction with minimal changes to the input. Besides, counterfactual explanations can also serve as efficient probes to the models' decisions. In this work, we exploit the potential of counterfactual explanations to understand and explore the behavior of machine learning models. We design DECE, an interactive visualization system that helps understand and explore a model's decisions on individual instances and data subsets, supporting users ranging from decision-subjects to model developers. DECE supports exploratory analysis of model decisions by combining the strengths of counterfactual explanations at instance- and subgroup-levels. We also introduce a set of interactions that enable users to customize the generation of counterfactual explanations to find more actionable ones that can suit their needs. Through three use cases and an expert interview, we demonstrate the effectiveness of DECE in supporting decision exploration tasks and instance explanations.

* 10 pages, 7 figures. The paper will be published in IEEE Transactions on Visualization and Computer Graphics (TVCG)
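
The idea of a counterfactual explanation from the abstract, gaining the desired prediction with minimal changes to the input, can be sketched with a brute-force search. This is not DECE's algorithm: the toy scoring model, the feature names, the candidate grid, and the L1 distance are all assumptions made for illustration.

```python
from itertools import product

def predict(x):
    """Hypothetical toy model: approve (1) when a weighted score reaches 500."""
    income, credit = x
    return 1 if 6 * income + 4 * credit >= 500 else 0

def counterfactual(x, desired, candidates_per_feature):
    """Return the candidate closest to x (L1 distance) that the model
    maps to `desired`, or None if no candidate qualifies."""
    best, best_dist = None, float("inf")
    for cand in product(*candidates_per_feature):
        if predict(cand) != desired:
            continue
        dist = sum(abs(a - b) for a, b in zip(cand, x))
        if dist < best_dist:
            best, best_dist = cand, dist
    return best

# Assumed search grid: each feature ranges over 0..100 in steps of 5.
grid = [range(0, 101, 5), range(0, 101, 5)]
instance = (40, 50)  # currently rejected by the toy model
cf = counterfactual(instance, desired=1, candidates_per_feature=grid)
```

Exhaustive search only works for tiny grids like this one; it is used here to keep the notion of "minimal change" explicit rather than to suggest a practical generation method.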

Peer-inspired Student Performance Prediction in Interactive Online Question Pools with Graph Neural Network

Aug 15, 2020

Haotian Li, Huan Wei, Yong Wang, Yangqiu Song, Huamin Qu

Abstract:Student performance prediction is critical to online education. It can benefit many downstream tasks on online learning platforms, such as estimating dropout rates, facilitating strategic intervention, and enabling adaptive online learning. Interactive online question pools provide students with interesting interactive questions to practice their knowledge in online education. However, little research has been done on student performance prediction in interactive online question pools. Existing work on student performance prediction targets online learning platforms with predefined course curricula and accurate knowledge labels, such as MOOC platforms, but it cannot fully model the knowledge evolution of students in interactive online question pools. In this paper, we propose a novel approach using Graph Neural Networks (GNNs) to achieve better student performance prediction in interactive online question pools. Specifically, we model the relationship between students and questions using student interactions to construct the student-interaction-question network and further present a new GNN model, called R^2GCN, which intrinsically works for heterogeneous networks, to achieve generalizable student performance prediction in interactive online question pools. We evaluate the effectiveness of our approach on a real-world dataset consisting of 104,113 mouse trajectories generated in the problem-solving process of over 4000 students on 1631 questions. The experiment results show that our approach can achieve a much higher accuracy of student performance prediction than both traditional machine learning approaches and GNN models.

* 8 pages, 8 figures. Accepted at CIKM 2020

Visual Analysis of Discrimination in Machine Learning

Jul 30, 2020

Qianwen Wang, Zhenhua Xu, Zhutian Chen, Yong Wang, Shixia Liu, Huamin Qu

Abstract:The growing use of automated decision-making in critical applications, such as crime prediction and college admission, has raised questions about fairness in machine learning. How can we decide whether different treatments are reasonable or discriminatory? In this paper, we investigate discrimination in machine learning from a visual analytics perspective and propose an interactive visualization tool, DiscriLens, to support a more comprehensive analysis. To reveal detailed information on algorithmic discrimination, DiscriLens identifies a collection of potentially discriminatory itemsets based on causal modeling and classification rules mining. By combining an extended Euler diagram with a matrix-based visualization, we develop a novel set visualization to facilitate the exploration and interpretation of discriminatory itemsets. A user study shows that users can interpret the visually encoded information in DiscriLens quickly and accurately. Use cases demonstrate that DiscriLens provides informative guidance in understanding and reducing algorithmic discrimination.

HypoML: Visual Analysis for Hypothesis-based Evaluation of Machine Learning Models

Feb 12, 2020

Qianwen Wang, William Alexander, Jack Pegg, Huamin Qu, Min Chen

Abstract:In this paper, we present a visual analytics tool for enabling hypothesis-based evaluation of machine learning (ML) models. We describe a novel ML-testing framework that combines the traditional statistical hypothesis testing (commonly used in empirical research) with logical reasoning about the conclusions of multiple hypotheses. The framework defines a controlled configuration for testing a number of hypotheses as to whether and how some extra information about a "concept" or "feature" may benefit or hinder an ML model. Because reasoning about multiple hypotheses is not always straightforward, we provide HypoML as a visual analysis tool, with which the multi-thread testing data is transformed into a visual representation for rapid observation of the conclusions and the logical flow between the testing data and hypotheses. We have applied HypoML to a number of hypothesized concepts, demonstrating the intuitive and explainable nature of the visual analysis.

* This article was submitted to EuroVis 2020 on 5 December 2020. It was not accepted. Because the reviews have not identified any technical problems that would undermine the novelty and validity of this work, we think that the article is ready to be released as an arXiv report. The EuroVis 2020 reviews and authors' short feedback can be found in the anc folder

VoiceCoach: Interactive Evidence-based Training for Voice Modulation Skills in Public Speaking

Jan 22, 2020

Xingbo Wang, Haipeng Zeng, Yong Wang, Aoyu Wu, Zhida Sun, Xiaojuan Ma, Huamin Qu

Abstract:The modulation of voice properties, such as pitch, volume, and speed, is crucial for delivering a successful public speech. However, it is challenging to master different voice modulation skills. Though many guidelines are available, they are often not practical enough to be applied in different public speaking situations, especially for novice speakers. We present VoiceCoach, an interactive evidence-based approach to facilitate the effective training of voice modulation skills. Specifically, we have analyzed the voice modulation skills from 2623 high-quality speeches (i.e., TED Talks) and use them as the benchmark dataset. Given a voice input, VoiceCoach automatically recommends good voice modulation examples from the dataset based on the similarity of both sentence structures and voice modulation skills. Immediate and quantitative visual feedback is provided to guide further improvement. The expert interviews and the user study provide support for the effectiveness and usability of VoiceCoach.

* Accepted by CHI '20
