
Young-Ho Kim


Computational Approaches for App-to-App Retrieval and Design Consistency Check

Sep 19, 2023
Seokhyeon Park, Wonjae Kim, Young-Ho Kim, Jinwook Seo

Extracting semantic representations from mobile user interfaces (UI) and using them to support designers' decision-making has shown potential as an effective computational design support tool. Current approaches rely on machine learning models trained on small mobile UI datasets to extract semantic vectors, and they retrieve similar-looking UIs for a query screenshot via screenshot-to-screenshot comparison. However, the usability of these methods is limited: they are often not open-sourced, their training pipelines are complex for practitioners to follow, and they cannot perform screenshot set-to-set (i.e., app-to-app) retrieval. To this end, we (1) employ visual models trained on large web-scale image collections and test whether they can extract UI representations in a zero-shot way and outperform existing specialized models, and (2) use mathematically founded methods to enable app-to-app retrieval and design consistency analysis. Our experiments show that our methods not only improve upon previous retrieval models but also enable multiple new applications.

* AI & HCI Workshop at the ICML 2023 
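The abstract does not name the specific set-to-set metric, so the following is only a plausible sketch of app-to-app retrieval: a symmetric Chamfer-style distance over per-screenshot embedding vectors, where each app is the set of its screenshots' embeddings. All function names and the choice of cosine distance here are illustrative assumptions, not the paper's method.

```python
from math import sqrt

def cosine_distance(u, v):
    """1 - cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return 1.0 - dot / (nu * nv)

def chamfer_app_distance(app_a, app_b):
    """Symmetric Chamfer distance between two apps, each given as a
    list of screenshot embedding vectors: the average nearest-neighbor
    distance, computed in both directions and averaged."""
    a_to_b = sum(min(cosine_distance(u, v) for v in app_b) for u in app_a) / len(app_a)
    b_to_a = sum(min(cosine_distance(v, u) for u in app_a) for v in app_b) / len(app_b)
    return 0.5 * (a_to_b + b_to_a)
```

Ranking candidate apps by this distance against a query app would yield an app-to-app retrieval list; the embeddings themselves could come from any pretrained visual model.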

Designing a Direct Feedback Loop between Humans and Convolutional Neural Networks through Local Explanations

Jul 08, 2023
Tong Steven Sun, Yuyang Gao, Shubham Khaladkar, Sijia Liu, Liang Zhao, Young-Ho Kim, Sungsoo Ray Hong

Local explanations provide heatmaps on images to show how Convolutional Neural Networks (CNNs) derive their output. Due to its visual straightforwardness, the method has been one of the most popular explainable AI (XAI) techniques for diagnosing CNNs. Through our formative study (S1), however, we captured ML engineers' ambivalent perspective on local explanations: a valuable and indispensable tool in building CNNs, yet a process that exhausts them due to the heuristic nature of detecting vulnerabilities. Moreover, steering a CNN based on the vulnerabilities learned from diagnosis seemed highly challenging. To bridge this gap, we designed DeepFuse, the first interactive design that realizes a direct feedback loop between a user and CNNs for diagnosing and revising a CNN's vulnerabilities using local explanations. DeepFuse helps CNN engineers systematically search for "unreasonable" local explanations and annotate new boundaries for those identified as unreasonable in a labor-efficient manner. Next, it steers the model based on the given annotations so that the model does not repeat similar mistakes. We conducted a two-day study (S2) with 12 experienced CNN engineers. Using DeepFuse, participants built a more accurate and "reasonable" model than the current state of the art. Participants also found that the case-based reasoning DeepFuse guides can practically improve their current practice. We provide design implications explaining how future HCI-driven design can move practice forward to make XAI-driven insights more actionable.

* 32 pages, 6 figures, 5 tables. Accepted for publication in the Proceedings of the ACM on Human-Computer Interaction (PACM HCI), CSCW 2023 
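The abstract does not spell out DeepFuse's training objective. One common way to steer a model with corrected explanations, sketched here purely as an illustration, is to penalize heatmap mass falling outside the user-annotated region; this penalty could then be added to the classification loss. The function and its inputs are hypothetical, not from the paper.

```python
def attention_penalty(heatmap, mask):
    """Fraction of explanation-heatmap mass that falls outside the
    user-annotated region. `heatmap` holds non-negative attribution
    values and `mask` holds 0/1 annotations; both are same-sized 2D
    lists. A value near 0 means the explanation stays inside the
    annotated boundary; near 1 means it is mostly 'unreasonable'."""
    total = sum(sum(row) for row in heatmap)
    outside = sum(
        h
        for hrow, mrow in zip(heatmap, mask)
        for h, m in zip(hrow, mrow)
        if m == 0
    )
    return outside / total if total else 0.0
```

During fine-tuning, a weighted sum such as `loss = task_loss + lam * attention_penalty(...)` would push the model's explanations toward the annotated boundaries.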

Revealing User Familiarity Bias in Task-Oriented Dialogue via Interactive Evaluation

May 23, 2023
Takyoung Kim, Jamin Shin, Young-Ho Kim, Sanghwan Bae, Sungdong Kim

Most task-oriented dialogue (TOD) benchmarks assume users who know exactly how to use the system, constraining user behaviors within the system's capabilities via strict user goals, namely a "user familiarity" bias. This data bias deepens when combined with data-driven TOD systems, as existing static evaluations cannot capture its effects. Hence, we conduct an interactive user study to unveil how vulnerable TOD systems are to realistic scenarios. In particular, we compare users given 1) detailed goal instructions that conform to the system's boundaries (closed-goal) and 2) vague goal instructions that are often unsupported but realistic (open-goal). Our study reveals that conversations in open-goal settings lead to catastrophic failures of the system: 92% of the dialogues had significant issues. Moreover, we conduct a thorough error-annotation analysis to identify distinctive features between the two settings. From this, we discover a novel "pretending" behavior, in which the system pretends to handle user requests even though they are beyond its capabilities. We discuss its characteristics and toxicity while emphasizing transparency and a fallback strategy for robust TOD systems.


AI-based Agents for Automated Robotic Endovascular Guidewire Manipulation

Apr 18, 2023
Young-Ho Kim, Èric Lluch, Gulsun Mehmet, Florin C. Ghesu, Ankur Kapoor

Endovascular guidewire manipulation is essential for minimally invasive clinical procedures such as Percutaneous Coronary Intervention (PCI), mechanical thrombectomy for acute ischemic stroke (AIS), and Transjugular Intrahepatic Portosystemic Shunt (TIPS). These procedures commonly require 3D vessel geometries from 3D Computed Tomography Angiography (CTA) images. During a procedure, the clinician generally places a guiding catheter in the ostium of the relevant vessel and then manipulates a wire through the catheter and across the blockage, using X-ray fluoroscopy only intermittently to visualize and guide the catheter, guidewire, and other devices. Clinicians thus control guidewires and catheters passively, relying on limited indirect observation from X-ray fluoroscopy (i.e., a 2D partial view of the devices, updated intermittently due to radiation limits). Modeling and controlling guidewire manipulation in coronary vessels remains challenging because of the complicated interaction between guidewire motions with different physical properties (e.g., loads, coating) and vessel geometries with lumen conditions, resulting in a highly non-linear system. This paper introduces a scalable learning pipeline to train AI-based agent models toward automated predictive endovascular device control. First, we create a scalable environment by pre-processing 3D CTA images, providing patient-specific 3D vessel geometry and the coronary centerline. Next, we apply a large quantity of randomly generated motion sequences from the proximal end to generate wire states for each environment using a physics-based device simulator. Then, we reformulate the control problem as a sequence-to-sequence learning problem, using a Transformer-based model trained to handle the non-linear sequential forward/inverse transition functions.
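A sequence-to-sequence formulation needs the continuous proximal-end motions expressed as a token sequence. The abstract does not describe the tokenization, so the sketch below is only one plausible scheme: binning translation and rotation increments into discrete action tokens a Transformer could consume. The bin sizes and token vocabulary are invented for illustration.

```python
# Illustrative bin sizes for discretizing proximal-end motions;
# not taken from the paper.
TRANS_BIN_MM = 0.5   # translation resolution in millimeters
ROT_BIN_DEG = 5.0    # rotation resolution in degrees

def tokenize_motion(motions):
    """Turn a list of (translation_mm, rotation_deg) increments applied
    at the proximal end into discrete action tokens, e.g. 'T+2' for a
    +1.0 mm push and 'R-2' for a -10 degree twist."""
    tokens = []
    for trans, rot in motions:
        t_idx = round(trans / TRANS_BIN_MM)
        r_idx = round(rot / ROT_BIN_DEG)
        tokens.append(f"T{t_idx:+d}")
        tokens.append(f"R{r_idx:+d}")
    return tokens
```

With such a vocabulary, the forward problem maps motion tokens to wire-state tokens and the inverse problem maps desired states back to motions, which is the seq2seq framing the abstract describes.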


Leveraging Large Language Models to Power Chatbots for Collecting User Self-Reported Data

Jan 14, 2023
Jing Wei, Sungdong Kim, Hyunhoon Jung, Young-Ho Kim

Large language models (LLMs) provide a new way to build chatbots by accepting natural language prompts. Yet, it is unclear how to design prompts to power chatbots to carry on naturalistic conversations while pursuing a given goal, such as collecting self-report data from users. We explore what design factors of prompts can help steer chatbots to talk naturally and collect data reliably. To this aim, we formulated four prompt designs with different structures and personas. Through an online study (N = 48) where participants conversed with chatbots driven by different designs of prompts, we assessed how prompt designs and conversation topics affected the conversation flows and users' perceptions of chatbots. Our chatbots covered 79% of the desired information slots during conversations, and the designs of prompts and topics significantly influenced the conversation flows and the data collection performance. We discuss the opportunities and challenges of building chatbots with LLMs.

* 22 pages including Appendix, 7 figures, 7 tables 
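The four prompt designs themselves are not given in the abstract. As a hedged illustration of the general setup, the sketch below assembles a data-collection prompt from a persona and a list of information slots, and computes slot coverage, the metric behind the 79% figure. The wording, function names, and slot format are all hypothetical.

```python
def build_prompt(persona, slots):
    """Assemble a data-collecting chatbot prompt from a persona
    description and the information slots it should elicit.
    The phrasing here is illustrative, not the paper's prompts."""
    slot_lines = "\n".join(f"- {s}" for s in slots)
    return (
        f"{persona}\n"
        "Carry on a natural conversation and collect the following "
        "information from the user:\n"
        f"{slot_lines}"
    )

def slot_coverage(desired, collected):
    """Fraction of desired slots actually filled in a conversation,
    where `collected` maps slot names to extracted values (or None)."""
    filled = sum(1 for s in desired if collected.get(s) is not None)
    return filled / len(desired)
```

Varying the persona string and the structure of the slot list would yield prompt designs analogous to the four conditions compared in the study.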

Design, Modeling, and Evaluation of Separable Tendon-Driven Robotic Manipulator with Long, Passive, Flexible Proximal Section

Jan 01, 2023
Christian DeBuys, Florin C. Ghesu, Jagadeesan Jayender, Reza Langari, Young-Ho Kim

The purpose of this work was to tackle practical issues that arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot that overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input that resolves the redundancy in the kinematics is provided, along with a physical interpretation of this redundancy. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with increasing proximal section angle under all testing conditions, with average error reductions of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay, and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final tip configuration by 89.14% relative to the baseline case.
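The two error sources named above suggest two simple corrections: subtract the constant DC offset (the role of re-tension compensation) and advance the measured signal by a fixed number of samples (the role of the simple hysteresis compensation). The sketch below is a toy version of that idea on a sampled bending-angle signal; the specific gains and delays are illustrative, not the paper's calibration.

```python
def compensate(signal, dc_offset, delay_samples):
    """Apply two toy corrections to a sampled bending-angle signal:
    advance it by `delay_samples` to counter hysteresis time delay
    (holding the last value at the end), then subtract a constant
    DC offset attributed to the proximal section angle."""
    shifted = signal[delay_samples:] + [signal[-1]] * delay_samples
    return [s - dc_offset for s in shifted]
```

In practice, both the offset and the delay would be identified during calibration for a given proximal section angle before being applied online.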


Leveraging Pre-Trained Language Models to Streamline Natural Language Interaction for Self-Tracking

Jun 07, 2022
Young-Ho Kim, Sungdong Kim, Minsuk Chang, Sang-Woo Lee

Current natural language interaction for self-tracking tools largely depends on bespoke implementations optimized for a specific tracking theme and data format, which is neither generalizable nor scalable to the tremendous design space of self-tracking. However, training machine learning models in the context of self-tracking is challenging due to the wide variety of tracking topics and data formats. In this paper, we propose a novel NLP task for self-tracking that extracts close- and open-ended information from a retrospective activity log described in plain text, and a domain-agnostic, GPT-3-based NLU framework that performs this task. The framework augments the prompt with synthetic samples to transform the task into 10-shot learning, addressing the cold-start problem of bootstrapping a new tracking topic. Our preliminary evaluation suggests that our approach significantly outperforms baseline QA models. Going further, we discuss future application domains on which NLP and HCI researchers can collaborate.

* Accepted to NAACL '22 2nd Workshop on Bridging Human-Computer Interaction and Natural Language Processing. 10 pages including appendix, 2 figures, and 1 table 
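The exact prompt layout used by the framework is not in the abstract; the sketch below only illustrates the general mechanics of 10-shot prompting with synthetic samples: k synthetic (log, extraction) pairs are prepended to the query log before sending the text to a completion model. The "Log:"/"Extraction:" labels are an assumed format.

```python
def build_few_shot_prompt(instruction, synthetic_examples, query, k=10):
    """Assemble a k-shot prompt from synthetic (log, extraction) pairs
    in the style of GPT-3 few-shot prompting. Each shot shows a plain-
    text activity log and the structured information to extract from
    it; the final block leaves the extraction blank for the model."""
    shots = synthetic_examples[:k]
    parts = [instruction]
    for log, extraction in shots:
        parts.append(f"Log: {log}\nExtraction: {extraction}")
    parts.append(f"Log: {query}\nExtraction:")
    return "\n\n".join(parts)
```

Bootstrapping a new tracking topic then amounts to generating a handful of synthetic logs for that topic, with no model training required.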

Design and validation of zero-slack separable manipulator for Intracardiac Echocardiography

Apr 01, 2022
Christian DeBuy, Florin Ghesu, Reza Langari, Young-Ho Kim

Clinicians require substantial training and experience to become comfortable steering an intracardiac echocardiography (ICE) catheter to localize and measure the treatment area and to watch for complications while device catheters are deployed through another access. Thus, a robotic-assist system that holds and actively manipulates the ICE catheter could ease the physician's workload. Existing commercially available robotic systems and research prototypes all use existing commercially available ICE catheters based on multiple tendon-sheath mechanisms (TSMs). To motorize an existing TSM-based ICE catheter, actuators interface with the outer handle knobs to manipulate four internal tendons. In practice, however, the actuators are located at a sterile, safe place far away from the ICE handle; to interface with the knobs, multiple coupled gear structures sit between the two, leading to highly nonlinear behavior (e.g., various slack, elasticity) alongside the hysteresis phenomena of the TSM. Since ICE catheters are designed for single use, the expensive actuators need to be located in a safe place so as to be reusable. Moreover, these actuators should interface as directly as possible with the tendons for accurate tip control. In this paper, we introduce a separable ICE catheter robot with four-tendon actuation: one part reusable and the other disposable. We also propose a practical model and calibration method for the proposed mechanism so that all four tendons are actuated simultaneously, allowing precise tip control and mitigating issues of conventional devices such as dead zone and hysteresis with simple linear compensation. We consider an open-loop controller, since many available ICE catheters are used without position-tracking sensors at the tip due to cost and single use.
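The "simple linear compensation" for the dead zone is not specified in the abstract. A standard form of dead-zone pre-compensation, shown here only as a sketch, shifts the commanded tendon input past the dead band so the modeled output tracks the desired value linearly; the width parameter is an illustrative calibration value, not from the paper.

```python
def invert_dead_zone(command, dead_zone_width):
    """Pre-compensate a symmetric actuator dead zone: shift a nonzero
    tendon command past the dead band so that, under the idealized
    model output = command - sign(command) * width, the output tracks
    the desired command linearly. Zero commands stay zero."""
    if command > 0:
        return command + dead_zone_width
    if command < 0:
        return command - dead_zone_width
    return 0.0
```

In an open-loop controller like the one described, this inversion would run on each of the four tendon commands after the calibration step identifies the dead-zone width.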


MyMove: Facilitating Older Adults to Collect In-Situ Activity Labels on a Smartwatch with Speech

Apr 01, 2022
Young-Ho Kim, Diana Chou, Bongshin Lee, Margaret Danilovich, Amanda Lazar, David E. Conroy, Hernisa Kacorri, Eun Kyoung Choe

Current activity tracking technologies are largely trained on younger adults' data, which can lead to solutions that are not well-suited for older adults. To build activity trackers for older adults, it is crucial to collect training data with them. To this end, we examine the feasibility of and challenges in collecting activity labels from older adults by leveraging speech. Specifically, we built MyMove, a speech-based smartwatch app that facilitates in-situ labeling with a low capture burden. We conducted a 7-day deployment study in which 13 older adults collected their activity labels and smartwatch sensor data while wearing a thigh-worn activity monitor. Participants were highly engaged, capturing 1,224 verbal reports in total. We extracted 1,885 activities with corresponding effort levels and timespans, and examined the usefulness of these reports as activity labels. We discuss the implications of our approach and the collected dataset for supporting older adults through personalized activity tracking technologies.

* To appear at ACM CHI 2022. 21 pages, 3 figures, 7 tables. For the NSF funded project, visit https://mymove-collective.github.io 