Get our free extension to see links to code for papers anywhere online!

Chrome logo Add to Chrome

Firefox logo Add to Firefox

"Topic": models, code, and papers

Clinical evaluation of semi-automatic opensource algorithmic software segmentation of the mandibular bone: Practical feasibility and assessment of a new course of action

May 11, 2018
Jürgen Wallner, Kerstin Hochegger, Xiaojun Chen, Irene Mischak, Knut Reinbacher, Mauro Pau, Tomislav Zrnc, Katja Schwenzer-Zimmerer, Wolfgang Zemann, Dieter Schmalstieg, Jan Egger

Computer assisted technologies based on algorithmic software segmentation are an increasing topic of interest in complex surgical cases. However - due to functional instability, time consuming software processes, personnel resources or licensed-based financial costs many segmentation processes are often outsourced from clinical centers to third parties and the industry. Therefore, the aim of this trial was to assess the practical feasibility of an easy available, functional stable and licensed-free segmentation approach to be used in the clinical practice. In this retrospective, randomized, controlled trail the accuracy and accordance of the open-source based segmentation algorithm GrowCut (GC) was assessed through the comparison to the manually generated ground truth of the same anatomy using 10 CT lower jaw data-sets from the clinical routine. Assessment parameters were the segmentation time, the volume, the voxel number, the Dice Score (DSC) and the Hausdorff distance (HD). Overall segmentation times were about one minute. Mean DSC values of over 85% and HD below 33.5 voxel could be achieved. Statistical differences between the assessment parameters were not significant (p<0.05) and correlation coefficients were close to the value one (r > 0.94). Complete functional stable and time saving segmentations with high accuracy and high positive correlation could be performed by the presented interactive open-source based approach. In the cranio-maxillofacial complex the used method could represent an algorithmic alternative for image-based segmentation in the clinical practice for e.g. surgical treatment planning or visualization of postoperative results and offers several advantages. Systematic comparisons to other segmentation approaches or with a greater data amount are areas of future works.

* Wallner J, et al. (2018) Clinical evaluation of semi-automatic open-source algorithmic software segmentation of the mandibular bone: Practical feasibility and assessment of a new course of action. PLoS ONE 13(5):e0196378 
* 26 pages 

  Access Paper or Ask Questions

Texture Classification of MR Images of the Brain in ALS using CoHOG

Sep 25, 2017
G M Mashrur E Elahi, Sanjay Kalra, Yee-Hong Yang

Texture analysis is a well-known research topic in computer vision and image processing and has many applications. Gradient-based texture methods have become popular in classification problems. For the first time we extend a well-known gradient-based method, Co-occurrence Histograms of Oriented Gradients (CoHOG) to extract texture features from 2D Magnetic Resonance Images (MRI). Unlike the original CoHOG method, we use the whole image instead of sub-regions for feature calculation. Also, we use a larger neighborhood size. Gradient orientations of the image pixels are calculated using Sobel, Gaussian Derivative (GD) and Local Frequency Descriptor Gradient (LFDG) operators. The extracted feature vector size is very large and classification using a large number of similar features does not provide the best results. In our proposed method, for the first time to our best knowledge, only a minimum number of significant features are selected using area under the receiver operator characteristic (ROC) curve (AUC) thresholds with <= 0.01. In this paper, we apply the proposed method to classify Amyotrophic Lateral Sclerosis (ALS) patients from the controls. It is observed that selected texture features from downsampled images are significantly different between patients and controls. These features are used in a linear support vector machine (SVM) classifier to determine the classification accuracy. Optimal sensitivity and specificity are also calculated. Three different cohort datasets are used in the experiments. The performance of the proposed method using three gradient operators and two different neighborhood sizes is analyzed. Region based analysis is performed to demonstrate that significant changes between patients and controls are limited to the motor cortex.

* We found an error in the feature selection part (Sec. 3.3) of the proposed approach. In the feature selection part, by mistake, we have used both the training and testing data for feature selection. We are working on it. We will update the proposed approach as early as possible 

  Access Paper or Ask Questions

The Roles and Modes of Human Interactions with Automated Machine Learning Systems

May 09, 2022
Thanh Tung Khuat, David Jacob Kedziora, Bogdan Gabrys

As automated machine learning (AutoML) systems continue to progress in both sophistication and performance, it becomes important to understand the `how' and `why' of human-computer interaction (HCI) within these frameworks, both current and expected. Such a discussion is necessary for optimal system design, leveraging advanced data-processing capabilities to support decision-making involving humans, but it is also key to identifying the opportunities and risks presented by ever-increasing levels of machine autonomy. Within this context, we focus on the following questions: (i) How does HCI currently look like for state-of-the-art AutoML algorithms, especially during the stages of development, deployment, and maintenance? (ii) Do the expectations of HCI within AutoML frameworks vary for different types of users and stakeholders? (iii) How can HCI be managed so that AutoML solutions acquire human trust and broad acceptance? (iv) As AutoML systems become more autonomous and capable of learning from complex open-ended environments, will the fundamental nature of HCI evolve? To consider these questions, we project existing literature in HCI into the space of AutoML; this connection has, to date, largely been unexplored. In so doing, we review topics including user-interface design, human-bias mitigation, and trust in artificial intelligence (AI). Additionally, to rigorously gauge the future of HCI, we contemplate how AutoML may manifest in effectively open-ended environments. This discussion necessarily reviews projected developmental pathways for AutoML, such as the incorporation of reasoning, although the focus remains on how and why HCI may occur in such a framework rather than on any implementational details. Ultimately, this review serves to identify key research directions aimed at better facilitating the roles and modes of human interactions with both current and future AutoML systems.

* Submitted to Foundations and Trends in Human-Computer Interaction 

  Access Paper or Ask Questions

BioADAPT-MRC: Adversarial Learning-based Domain Adaptation Improves Biomedical Machine Reading Comprehension Task

Feb 26, 2022
Maria Mahbub, Sudarshan Srinivasan, Edmon Begoli, Gregory D Peterson

Motivation: Biomedical machine reading comprehension (biomedical-MRC) aims to comprehend complex biomedical narratives and assist healthcare professionals in retrieving information from them. The high performance of modern neural network-based MRC systems depends on high-quality, large-scale, human-annotated training datasets. In the biomedical domain, a crucial challenge in creating such datasets is the requirement for domain knowledge, inducing the scarcity of labeled data and the need for transfer learning from the labeled general-purpose (source) domain to the biomedical (target) domain. However, there is a discrepancy in marginal distributions between the general-purpose and biomedical domains due to the variances in topics. Therefore, direct-transferring of learned representations from a model trained on a general-purpose domain to the biomedical domain can hurt the model's performance. Results: We present an adversarial learning-based domain adaptation framework for the biomedical machine reading comprehension task (BioADAPT-MRC), a neural network-based method to address the discrepancies in the marginal distributions between the general and biomedical domain datasets. BioADAPT-MRC relaxes the need for generating pseudo labels for training a well-performing biomedical-MRC model. We extensively evaluate the performance of BioADAPT-MRC by comparing it with the best existing methods on three widely used benchmark biomedical-MRC datasets -- BioASQ-7b, BioASQ-8b, and BioASQ-9b. Our results suggest that without using any synthetic or human-annotated data from the biomedical domain, BioADAPT-MRC can achieve state-of-the-art performance on these datasets. Availability: BioADAPT-MRC is freely available as an open-source project at\\

  Access Paper or Ask Questions

Security Orchestration, Automation, and Response Engine for Deployment of Behavioural Honeypots

Jan 14, 2022
Upendra Bartwal, Subhasis Mukhopadhyay, Rohit Negi, Sandeep Shukla

Cyber Security is a critical topic for organizations with IT/OT networks as they are always susceptible to attack, whether insider or outsider. Since the cyber landscape is an ever-evolving scenario, one must keep upgrading its security systems to enhance the security of the infrastructure. Tools like Security Information and Event Management (SIEM), Endpoint Detection and Response (EDR), Threat Intelligence Platform (TIP), Information Technology Service Management (ITSM), along with other defensive techniques like Intrusion Detection System (IDS), Intrusion Protection System (IPS), and many others enhance the cyber security posture of the infrastructure. However, the proposed protection mechanisms have their limitations, they are insufficient to ensure security, and the attacker penetrates the network. Deception technology, along with Honeypots, provides a false sense of vulnerability in the target systems to the attackers. The attacker deceived reveals threat intel about their modus operandi. We have developed a Security Orchestration, Automation, and Response (SOAR) Engine that dynamically deploys custom honeypots inside the internal network infrastructure based on the attacker's behavior. The architecture is robust enough to support multiple VLANs connected to the system and used for orchestration. The presence of botnet traffic and DDOS attacks on the honeypots in the network is detected, along with a malware collection system. After being exposed to live traffic for four days, our engine dynamically orchestrated the honeypots 40 times, detected 7823 attacks, 965 DDOS attack packets, and three malicious samples. While our experiments with static honeypots show an average attacker engagement time of 102 seconds per instance, our SOAR Engine-based dynamic honeypots engage attackers on average 3148 seconds.

* SOAR Engine for Honeypots Deployment, 8 pages, 7 figures 

  Access Paper or Ask Questions

Towards a Science of Human-AI Decision Making: A Survey of Empirical Studies

Dec 21, 2021
Vivian Lai, Chacha Chen, Q. Vera Liao, Alison Smith-Renner, Chenhao Tan

As AI systems demonstrate increasingly strong predictive performance, their adoption has grown in numerous domains. However, in high-stakes domains such as criminal justice and healthcare, full automation is often not desirable due to safety, ethical, and legal concerns, yet fully manual approaches can be inaccurate and time consuming. As a result, there is growing interest in the research community to augment human decision making with AI assistance. Besides developing AI technologies for this purpose, the emerging field of human-AI decision making must embrace empirical approaches to form a foundational understanding of how humans interact and work with AI to make decisions. To invite and help structure research efforts towards a science of understanding and improving human-AI decision making, we survey recent literature of empirical human-subject studies on this topic. We summarize the study design choices made in over 100 papers in three important aspects: (1) decision tasks, (2) AI models and AI assistance elements, and (3) evaluation metrics. For each aspect, we summarize current trends, discuss gaps in current practices of the field, and make a list of recommendations for future research. Our survey highlights the need to develop common frameworks to account for the design and research spaces of human-AI decision making, so that researchers can make rigorous choices in study design, and the research community can build on each other's work and produce generalizable scientific knowledge. We also hope this survey will serve as a bridge for HCI and AI communities to work together to mutually shape the empirical science and computational technologies for human-AI decision making.

* 36 pages, 2 figures, see for website 

  Access Paper or Ask Questions

Towards High-Quality Temporal Action Detection with Sparse Proposals

Sep 18, 2021
Jiannan Wu, Peize Sun, Shoufa Chen, Jiewen Yang, Zihao Qi, Lan Ma, Ping Luo

Temporal Action Detection (TAD) is an essential and challenging topic in video understanding, aiming to localize the temporal segments containing human action instances and predict the action categories. The previous works greatly rely upon dense candidates either by designing varying anchors or enumerating all the combinations of boundaries on video sequences; therefore, they are related to complicated pipelines and sensitive hand-crafted designs. Recently, with the resurgence of Transformer, query-based methods have tended to become the rising solutions for their simplicity and flexibility. However, there still exists a performance gap between query-based methods and well-established methods. In this paper, we identify the main challenge lies in the large variants of action duration and the ambiguous boundaries for short action instances; nevertheless, quadratic-computational global attention prevents query-based methods to build multi-scale feature maps. Towards high-quality temporal action detection, we introduce Sparse Proposals to interact with the hierarchical features. In our method, named SP-TAD, each proposal attends to a local segment feature in the temporal feature pyramid. The local interaction enables utilization of high-resolution features to preserve action instances details. Extensive experiments demonstrate the effectiveness of our method, especially under high tIoU thresholds. E.g., we achieve the state-of-the-art performance on THUMOS14 (45.7% on [email protected], 33.4% on [email protected] and 53.5% on [email protected]) and competitive results on ActivityNet-1.3 (32.99% on [email protected]). Code will be made available at

* 10 pages, 5 figures 

  Access Paper or Ask Questions

Taxonomizing local versus global structure in neural network loss landscapes

Jul 23, 2021
Yaoqing Yang, Liam Hodgkinson, Ryan Theisen, Joe Zou, Joseph E. Gonzalez, Kannan Ramchandran, Michael W. Mahoney

Viewing neural network models in terms of their loss landscapes has a long history in the statistical mechanics approach to learning, and in recent years it has received attention within machine learning proper. Among other things, local metrics (such as the smoothness of the loss landscape) have been shown to correlate with global properties of the model (such as good generalization). Here, we perform a detailed empirical analysis of the loss landscape structure of thousands of neural network models, systematically varying learning tasks, model architectures, and/or quantity/quality of data. By considering a range of metrics that attempt to capture different aspects of the loss landscape, we demonstrate that the best test accuracy is obtained when: the loss landscape is globally well-connected; ensembles of trained models are more similar to each other; and models converge to locally smooth regions. We also show that globally poorly-connected landscapes can arise when models are small or when they are trained to lower quality data; and that, if the loss landscape is globally poorly-connected, then training to zero loss can actually lead to worse test accuracy. Based on these results, we develop a simple one-dimensional model with load-like and temperature-like parameters, we introduce the notion of an \emph{effective loss landscape} depending on these parameters, and we interpret our results in terms of a \emph{rugged convexity} of the loss landscape. When viewed through this lens, our detailed empirical results shed light on phases of learning (and consequent double descent behavior), fundamental versus incidental determinants of good generalization, the role of load-like and temperature-like parameters in the learning process, different influences on the loss landscape from model and data, and the relationships between local and global metrics, all topics of recent interest.

  Access Paper or Ask Questions

TagRec: Automated Tagging of Questions with Hierarchical Learning Taxonomy

Jul 03, 2021
Venktesh V, Mukesh Mohania, Vikram Goyal

Online educational platforms organize academic questions based on a hierarchical learning taxonomy (subject-chapter-topic). Automatically tagging new questions with existing taxonomy will help organize these questions into different classes of hierarchical taxonomy so that they can be searched based on the facets like chapter. This task can be formulated as a flat multi-class classification problem. Usually, flat classification based methods ignore the semantic relatedness between the terms in the hierarchical taxonomy and the questions. Some traditional methods also suffer from the class imbalance issues as they consider only the leaf nodes ignoring the hierarchy. Hence, we formulate the problem as a similarity-based retrieval task where we optimize the semantic relatedness between the taxonomy and the questions. We demonstrate that our method helps to handle the unseen labels and hence can be used for taxonomy tagging in the wild. In this method, we augment the question with its corresponding answer to capture more semantic information and then align the question-answer pair's contextualized embedding with the corresponding label (taxonomy) vector representations. The representations are aligned by fine-tuning a transformer based model with a loss function that is a combination of the cosine similarity and hinge rank loss. The loss function maximizes the similarity between the question-answer pair and the correct label representations and minimizes the similarity to unrelated labels. Finally, we perform experiments on two real-world datasets. We show that the proposed learning method outperforms representations learned using the multi-class classification method and other state of the art methods by 6% as measured by [email protected] We also demonstrate the performance of the proposed method on unseen but related learning content like the learning objectives without re-training the network.

* 16 pages, accepted at ECML-PKDD 2021 

  Access Paper or Ask Questions

KLUE: Korean Language Understanding Evaluation

Jun 11, 2021
Sungjoon Park, Jihyung Moon, Sungdong Kim, Won Ik Cho, Jiyoon Han, Jangwon Park, Chisung Song, Junseong Kim, Yongsook Song, Taehwan Oh, Joohong Lee, Juhyun Oh, Sungwon Lyu, Younghoon Jeong, Inkwon Lee, Sangwoo Seo, Dongjun Lee, Hyunwoo Kim, Myeonghwa Lee, Seongbo Jang, Seungwon Do, Sunkyoung Kim, Kyungtae Lim, Jongwon Lee, Kyumin Park, Jamin Shin, Seonghyun Kim, Lucy Park, Alice Oh, Jung-Woo Ha, Kyunghyun Cho

We introduce Korean Language Understanding Evaluation (KLUE) benchmark. KLUE is a collection of 8 Korean natural language understanding (NLU) tasks, including Topic Classification, SemanticTextual Similarity, Natural Language Inference, Named Entity Recognition, Relation Extraction, Dependency Parsing, Machine Reading Comprehension, and Dialogue State Tracking. We build all of the tasks from scratch from diverse source corpora while respecting copyrights, to ensure accessibility for anyone without any restrictions. With ethical considerations in mind, we carefully design annotation protocols. Along with the benchmark tasks and data, we provide suitable evaluation metrics and fine-tuning recipes for pretrained language models for each task. We furthermore release the pretrained language models (PLM), KLUE-BERT and KLUE-RoBERTa, to help reproducing baseline models on KLUE and thereby facilitate future research. We make a few interesting observations from the preliminary experiments using the proposed KLUE benchmark suite, already demonstrating the usefulness of this new benchmark suite. First, we find KLUE-RoBERTa-large outperforms other baselines, including multilingual PLMs and existing open-source Korean PLMs. Second, we see minimal degradation in performance even when we replace personally identifiable information from the pretraining corpus, suggesting that privacy and NLU capability are not at odds with each other. Lastly, we find that using BPE tokenization in combination with morpheme-level pre-tokenization is effective in tasks involving morpheme-level tagging, detection and generation. In addition to accelerating Korean NLP research, our comprehensive documentation on creating KLUE will facilitate creating similar resources for other languages in the future. KLUE is available at

* 76 pages, 10 figures, 36 tables 

  Access Paper or Ask Questions