Jing Luo

Inter-Rater Uncertainty Quantification in Medical Image Segmentation via Rater-Specific Bayesian Neural Networks

Jun 28, 2023
Qingqiao Hu, Hao Wang, Jing Luo, Yunhao Luo, Zhiheng Zhang, Jan S. Kirschke, Benedikt Wiestler, Bjoern Menze, Jianguo Zhang, Hongwei Bran Li

Automated medical image segmentation inherently involves a certain degree of uncertainty. One key factor contributing to this uncertainty is the ambiguity that can arise in determining the boundaries of a target region of interest, primarily due to variations in image appearance. On top of this, even among experts in the field, different opinions can emerge regarding the precise definition of specific anatomical structures. This work specifically addresses the modeling of segmentation uncertainty, known as inter-rater uncertainty. Its primary objective is to explore and analyze the variability in segmentation outcomes that can occur when multiple experts in medical imaging interpret and annotate the same images. We introduce a novel Bayesian neural network-based architecture to estimate inter-rater uncertainty in medical image segmentation. Our approach has three key advancements. Firstly, we introduce a one-encoder-multi-decoder architecture specifically tailored for uncertainty estimation, enabling us to capture the rater-specific representation of each expert involved. Secondly, we propose Bayesian modeling for the new architecture, allowing efficient capture of the inter-rater distribution, particularly in scenarios with limited annotations. Lastly, we enhance the rater-specific representation by integrating an attention module into each decoder. This module facilitates focused and refined segmentation results for each rater. We conduct extensive evaluations using synthetic and real-world datasets to validate our technical innovations rigorously. Our method surpasses existing baseline methods in five out of seven diverse tasks on the publicly available QUBIQ dataset, considering two evaluation metrics encompassing different uncertainty aspects. Our codes, models, and the new dataset are available through our GitHub repository: https://github.com/HaoWang420/bOEMD-net.
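As a rough illustration of the one-encoder-multi-decoder design described in this abstract, here is a minimal PyTorch sketch: a shared encoder feeds one attention-gated decoder head per rater, and the per-pixel variance across heads serves as an inter-rater uncertainty map. All names and layer sizes are illustrative, and the Bayesian treatment of the weights is omitted; this is not the authors' bOEMD-net.

```python
# Hypothetical sketch of a one-encoder-multi-decoder segmentation network
# with a simple channel-attention gate per decoder head (illustrative only).
import torch
import torch.nn as nn

class OneEncoderMultiDecoder(nn.Module):
    def __init__(self, in_ch=1, base=16, num_raters=3):
        super().__init__()
        # shared encoder over the input image
        self.enc = nn.Sequential(
            nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, base, 3, padding=1), nn.ReLU(),
        )
        # one attention gate and one decoder head per rater
        self.attn = nn.ModuleList(
            nn.Sequential(nn.AdaptiveAvgPool2d(1),
                          nn.Conv2d(base, base, 1), nn.Sigmoid())
            for _ in range(num_raters))
        self.dec = nn.ModuleList(
            nn.Conv2d(base, 1, 1) for _ in range(num_raters))

    def forward(self, x):
        h = self.enc(x)
        # each head reweights the shared features, then predicts one mask
        masks = [d(h * a(h)) for a, d in zip(self.attn, self.dec)]
        return torch.sigmoid(torch.cat(masks, dim=1))  # (B, num_raters, H, W)

preds = OneEncoderMultiDecoder()(torch.randn(2, 1, 64, 64))
inter_rater_uncertainty = preds.var(dim=1)  # per-pixel variance across raters
```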

* submitted to a journal for review 

Learning a Structural Causal Model for Intuition Reasoning in Conversation

May 28, 2023
Hang Chen, Bingyu Liao, Jing Luo, Wenjing Zhu, Xinyu Yang

Reasoning, a crucial aspect of NLP research, has not been adequately addressed by prevailing models, including large language models. Conversation reasoning, a critical component of it, remains largely unexplored due to the absence of a well-designed cognitive model. In this paper, inspired by intuition theory on conversation cognition, we develop a conversation cognitive model (CCM) that explains how each utterance receives and activates channels of information recursively. We then algebraically transform CCM into a structural causal model (SCM) under mild assumptions, rendering it compatible with various causal discovery methods. We further propose a probabilistic implementation of the SCM for utterance-level relation reasoning. By leveraging variational inference, it explores substitutes for implicit causes, addresses the issue of their unobservability, and reconstructs the causal representations of utterances through the evidence lower bound. Moreover, we construct synthetic and simulated datasets incorporating implicit causes and complete cause labels, alleviating the current situation in which all available datasets are agnostic to implicit causes. Extensive experiments demonstrate that our proposed method significantly outperforms existing methods on synthetic, simulated, and real-world datasets. Finally, we analyze the performance of CCM under latent confounders and propose theoretical ideas for addressing this currently unresolved issue.
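The variational treatment of unobservable implicit causes can be pictured with a small VAE-style module: an inference network maps each utterance representation to a latent cause, a decoder reconstructs the utterance, and training minimizes the negative evidence lower bound. The module and its dimensions below are hypothetical stand-ins, not the paper's actual implementation.

```python
# Hedged sketch: latent implicit causes inferred per utterance via the
# reparameterization trick, trained on reconstruction + KL (negative ELBO).
import torch
import torch.nn as nn

class LatentCauseVAE(nn.Module):
    def __init__(self, d_utt=128, d_z=16):
        super().__init__()
        self.to_mu = nn.Linear(d_utt, d_z)
        self.to_logvar = nn.Linear(d_utt, d_z)
        self.decode = nn.Linear(d_z, d_utt)

    def forward(self, u):
        mu, logvar = self.to_mu(u), self.to_logvar(u)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # sample q(z|u)
        recon = self.decode(z)
        rec = ((recon - u) ** 2).sum(-1)                          # reconstruction
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1)  # KL to N(0, I)
        return (rec + kl).mean()  # negative ELBO to minimize

loss = LatentCauseVAE()(torch.randn(8, 128))  # a batch of 8 utterance vectors
```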

Affective Reasoning at Utterance Level in Conversations: A Causal Discovery Approach

May 04, 2023
Hang Chen, Jing Luo, Xinyu Yang, Wenjing Zhu

Affective reasoning is a set of emerging affect-based tasks in conversation, including Emotion Recognition in Conversation (ERC), Emotion-Cause Pair Extraction (ECPE), and Emotion-Cause Span Recognition (ECSR). Existing methods make various assumptions about the apparent relationships while neglecting the essential causal model, owing to the non-uniqueness of skeletons and the unobservability of implicit causes. This paper addresses these two problems and proposes Conversational Affective Causal Discovery (CACD), a novel causal discovery method that uncovers causal relationships in a conversation by designing a common skeleton and generating a substitute for implicit causes. CACD contains two steps: (i) building a common causal skeleton, centered on one graph node, for all utterances in variable-length conversations; (ii) using a Causal Auto-Encoder (CAE) to correct the skeleton and yield causal representations through generated implicit causes and known explicit causes. Comprehensive experiments demonstrate that our novel method significantly outperforms the SOTA baselines on six affect-related datasets across the three tasks.
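One plausible reading of step (i) is a shared skeleton in which every earlier utterance is a potential cause of the current one, which scales naturally to variable-length conversations. The sketch below encodes exactly that as an adjacency matrix; the real CACD skeleton differs in detail, so treat this as an assumption-laden illustration.

```python
# Hedged sketch of a common causal skeleton over a conversation:
# entry (j, i) = 1 means utterance j is a candidate cause of utterance i.
import numpy as np

def common_skeleton(num_utterances: int) -> np.ndarray:
    """Strictly upper-triangular adjacency: only earlier utterances may cause later ones."""
    return np.triu(np.ones((num_utterances, num_utterances), dtype=int), k=1)

print(common_skeleton(4))  # works for any conversation length
```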

ADFF: Attention Based Deep Feature Fusion Approach for Music Emotion Recognition

Apr 12, 2022
Zi Huang, Shulei Ji, Zhilan Hu, Chuangjian Cai, Jing Luo, Xinyu Yang

Music emotion recognition (MER), a sub-task of music information retrieval (MIR), has developed rapidly in recent years. However, learning affect-salient features remains a challenge. In this paper, we propose an end-to-end attention-based deep feature fusion (ADFF) approach for MER. Taking only the log Mel-spectrogram as input, this method uses an adapted VGGNet as the spatial feature learning module (SFLM) to obtain spatial features across different levels. These features are then fed into a squeeze-and-excitation (SE) attention-based temporal feature learning module (TFLM) to obtain multi-level emotion-related spatial-temporal features (ESTFs), which discriminate emotions well in the final emotion space. In addition, a novel data-processing scheme is devised to cut the single-channel input into multiple channels, improving computational efficiency while preserving MER quality. Experiments show that, compared to the state-of-the-art model, our method achieves relative improvements of 10.43% and 4.82% in the R2 score for valence and arousal respectively, and performs better on datasets with distinct scales and in multi-task learning.
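A minimal, hypothetical sketch of this pipeline shape follows: a small VGG-style convolutional stack over a log Mel-spectrogram, a squeeze-and-excitation block that reweights channels, and a two-output regressor for valence/arousal. The real SFLM/TFLM are deeper and fuse features from several levels; all sizes here are placeholders.

```python
# Hedged ADFF-style sketch: VGG-like spatial module + SE attention + V/A head.
import torch
import torch.nn as nn

class SE(nn.Module):
    def __init__(self, ch, r=4):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(ch, ch // r), nn.ReLU(),
                                nn.Linear(ch // r, ch), nn.Sigmoid())

    def forward(self, x):                      # x: (B, C, freq, time)
        w = self.fc(x.mean(dim=(2, 3)))        # squeeze over freq/time
        return x * w[:, :, None, None]         # excite channels

class ADFFSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.sflm = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                                  nn.MaxPool2d(2),
                                  nn.Conv2d(32, 64, 3, padding=1), nn.ReLU())
        self.tflm = SE(64)
        self.head = nn.Linear(64, 2)           # valence, arousal

    def forward(self, mel):                    # mel: (B, 1, n_mels, frames)
        h = self.tflm(self.sflm(mel))
        return self.head(h.mean(dim=(2, 3)))

va = ADFFSketch()(torch.randn(4, 1, 128, 256))  # -> (4, 2)
```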

* Submitted to Interspeech 2022 

Asymptotic in a class of network models with sub-Gamma perturbations

Nov 02, 2021
Jiaxin Guo, Haoyu Wei, Xiaoyu Lei, Jing Luo

For differential privacy under sub-Gamma noise, we derive the asymptotic properties of a class of binary-valued network models with a general link function. We release the degree sequences of binary networks under a general noisy mechanism, with the discrete Laplace mechanism as a special case. We establish asymptotic results, including both consistency and asymptotic normality of the parameter estimator, as the number of parameters goes to infinity in this class of network models. Simulations and a real-data example are provided to illustrate the asymptotic results.
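The discrete Laplace special case of this release mechanism is easy to picture: perturb each degree with two-sided geometric noise, which is the difference of two i.i.d. geometric variables. The snippet below is a sketch under that standard construction; the epsilon value and network are placeholders, not the paper's setup.

```python
# Hedged sketch: releasing a noisy degree sequence of a binary network with
# discrete Laplace (two-sided geometric) noise, a sub-Gamma special case.
import numpy as np

rng = np.random.default_rng(0)

def discrete_laplace(size, epsilon):
    # difference of two i.i.d. Geometric(1 - e^{-eps}) variables
    p = 1.0 - np.exp(-epsilon)
    return rng.geometric(p, size) - rng.geometric(p, size)

upper = np.triu(rng.integers(0, 2, size=(50, 50)), 1)
adjacency = upper + upper.T                # undirected binary network
degrees = adjacency.sum(axis=1)
noisy_degrees = degrees + discrete_laplace(degrees.shape, epsilon=1.0)
```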

WakaVT: A Sequential Variational Transformer for Waka Generation

Apr 01, 2021
Yuka Takeishi, Mingxuan Niu, Jing Luo, Zhong Jin, Xinyu Yang

Poetry generation has long been a challenge for artificial intelligence. Within Japanese poetry generation, many researchers have paid attention to Haiku generation, but few have focused on Waka generation. To further explore the creative potential of natural language generation systems in Japanese poetry creation, we propose a novel Waka generation model, WakaVT, which automatically produces Waka poems given user-specified keywords. Firstly, an additive mask-based approach is presented to satisfy the form constraint. Secondly, the structures of the Transformer and the variational autoencoder are integrated to enhance the quality of the generated content. Specifically, to obtain novelty and diversity, WakaVT employs a sequence of latent variables, which effectively captures word-level variability in Waka data. To improve linguistic quality in terms of fluency, coherence, and meaningfulness, we further propose a fused multilevel self-attention mechanism, which properly models the hierarchical linguistic structure of Waka. To the best of our knowledge, we are the first to investigate Waka generation with models based on the Transformer and/or the variational autoencoder.
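The "sequence of latent variables" idea can be sketched as one Gaussian latent per time step, sampled with the reparameterization trick and fused with the token representation before decoding. This is a rough illustration under that assumption, not the authors' exact WakaVT; names and dimensions are invented.

```python
# Hedged sketch: per-step latent variables for word-level variability.
import torch
import torch.nn as nn

class SequentialLatent(nn.Module):
    def __init__(self, d_model=64, d_z=8, vocab=1000):
        super().__init__()
        self.emb = nn.Embedding(vocab, d_model)
        self.infer = nn.Linear(d_model, 2 * d_z)        # per-step mu, logvar
        self.fuse = nn.Linear(d_model + d_z, d_model)
        self.out = nn.Linear(d_model, vocab)

    def forward(self, tokens):                           # tokens: (B, T)
        h = self.emb(tokens)
        mu, logvar = self.infer(h).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # one z per step
        logits = self.out(torch.tanh(self.fuse(torch.cat([h, z], dim=-1))))
        kl = 0.5 * (mu ** 2 + logvar.exp() - logvar - 1).sum(-1).mean()
        return logits, kl

# 31 tokens as a nod to the 5-7-5-7-7 syllable form of Waka
logits, kl = SequentialLatent()(torch.randint(0, 1000, (2, 31)))
```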

* This paper has been submitted to Neural Processing Letters 

A Comprehensive Survey on Deep Music Generation: Multi-level Representations, Algorithms, Evaluations, and Future Directions

Nov 13, 2020
Shulei Ji, Jing Luo, Xinyu Yang

The use of deep learning techniques to generate various kinds of content (such as images and text) has become a trend. Music, the topic of this paper, has attracted widespread attention from countless researchers. The whole process of producing music can be divided into three stages, corresponding to three levels of music generation: score generation produces scores, performance generation adds performance characteristics to the scores, and audio generation converts scores with performance characteristics into audio by assigning timbre, or generates music in audio format directly. Previous surveys have explored the network models employed in automatic music generation. However, the development history, the evolution of models, and the pros and cons of approaches to the same music generation task have not been clearly illustrated. This paper provides an overview of various composition tasks at different levels of music generation, covering most of the currently popular music generation tasks that use deep learning. In addition, we summarize the datasets suitable for diverse tasks, discuss music representations, evaluation methods, and the challenges at each level, and finally point out several future directions.

* 96 pages; this is a draft 

Flow Rate Control in Smart District Heating Systems Using Deep Reinforcement Learning

Dec 01, 2019
Tinghao Zhang, Jing Luo, Ping Chen, Jie Liu

At high latitudes, many cities adopt centralized heating systems to improve energy generation efficiency and reduce pollution. In multi-tier systems, so-called district heating, few efficient approaches exist for controlling the flow rate during the heating process. In this paper, we describe theoretical methods for solving this problem with deep reinforcement learning and propose a cloud-based heating control system for implementation. A real-world case study shows the effectiveness and practicability of the proposed system under human control, and simulated experiments with deep reinforcement learning show that about 1985.01 gigajoules of heat and 42276.45 tons of water are saved per hour compared with manual control.
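Schematically, such a controller observes sensor readings and picks a discrete flow-rate adjustment. The loop below is a hypothetical DQN-style sketch: the state layout, action set, and network size are all placeholders, not the paper's system.

```python
# Hedged sketch: epsilon-greedy flow-rate adjustment from a small Q-network.
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(4, 32), nn.ReLU(), nn.Linear(32, 3))
actions = [-0.1, 0.0, 0.1]        # decrease / hold / increase flow (m^3/h)

def choose_adjustment(state, eps=0.1):
    """Epsilon-greedy choice of a flow-rate adjustment."""
    if torch.rand(()) < eps:
        return actions[torch.randint(len(actions), ()).item()]
    with torch.no_grad():
        return actions[q_net(state).argmax().item()]

# hypothetical state: [supply temp, return temp, outdoor temp, current flow]
state = torch.tensor([70.0, 45.0, -5.0, 12.0])
delta = choose_adjustment(state)
```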

* Submitted to Information Processing in Sensor Networks (IPSN 2020) 

MG-VAE: Deep Chinese Folk Songs Generation with Specific Regional Style

Sep 29, 2019
Jing Luo, Xinyu Yang, Shulei Ji, Juan Li

Regional styles in Chinese folk songs are a rich treasure for ethnic music creation and folk culture research. In this paper, we propose MG-VAE, a music generative model based on the VAE (Variational Auto-Encoder) that is capable of capturing specific music styles and generating novel tunes for Chinese folk songs (Min Ge) in a manipulable way. Specifically, we disentangle the latent space of the VAE into four parts via adversarial training to control the information of the pitch and rhythm sequences, as well as the music style and content. In detail, two classifiers are used to separate the style and content latent spaces, and temporal supervision is used to disentangle the pitch and rhythm sequences. Experimental results show that the disentanglement is successful and that our model is able to create novel folk songs with controllable regional styles. To the best of our knowledge, this is the first study to apply deep generative models and adversarial training to Chinese music generation.
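The four-way latent split can be pictured as partitioning one encoder output into pitch, rhythm, style, and content parts, with an auxiliary classifier attached to the style part. The sketch below illustrates only that partitioning; the adversarial and temporal-supervision losses of the real MG-VAE are omitted, and all names and sizes are invented.

```python
# Hedged sketch: latent vector split into four disentangled parts,
# with a style classifier on the style part.
import torch
import torch.nn as nn

class SplitLatent(nn.Module):
    def __init__(self, d_in=128, d_part=16, num_styles=5):
        super().__init__()
        self.encode = nn.Linear(d_in, 4 * d_part)
        self.style_clf = nn.Linear(d_part, num_styles)

    def forward(self, x):
        z_pitch, z_rhythm, z_style, z_content = self.encode(x).chunk(4, dim=-1)
        style_logits = self.style_clf(z_style)   # supervise style subspace
        return (z_pitch, z_rhythm, z_style, z_content), style_logits

(zp, zr, zs, zc), logits = SplitLatent()(torch.randn(8, 128))
```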

* Accepted by the 7th Conference on Sound and Music Technology, 2019, Harbin, China 