Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chao Chen

Robotics Institute, University of Michigan, Ann Arbor

Gradient-based Fuzzy System Optimisation via Automatic Differentiation -- FuzzyR as a Use Case

Mar 18, 2024

Chao Chen, Christian Wagner, Jonathan M. Garibaldi

Figure 1 for Gradient-based Fuzzy System Optimisation via Automatic Differentiation -- FuzzyR as a Use Case

Figure 2 for Gradient-based Fuzzy System Optimisation via Automatic Differentiation -- FuzzyR as a Use Case

Figure 3 for Gradient-based Fuzzy System Optimisation via Automatic Differentiation -- FuzzyR as a Use Case

Figure 4 for Gradient-based Fuzzy System Optimisation via Automatic Differentiation -- FuzzyR as a Use Case

Abstract:Since their introduction, fuzzy sets and systems have become an important area of research known for its versatility in modelling, knowledge representation and reasoning, and increasingly its potential within the context explainable AI. While the applications of fuzzy systems are diverse, there has been comparatively little advancement in their design from a machine learning perspective. In other words, while representations such as neural networks have benefited from a boom in learning capability driven by an increase in computational performance in combination with advances in their training mechanisms and available tool, in particular gradient descent, the impact on fuzzy system design has been limited. In this paper, we discuss gradient-descent-based optimisation of fuzzy systems, focussing in particular on automatic differentiation -- crucial to neural network learning -- with a view to free fuzzy system designers from intricate derivative computations, allowing for more focus on the functional and explainability aspects of their design. As a starting point, we present a use case in FuzzyR which demonstrates how current fuzzy inference system implementations can be adjusted to leverage powerful features of automatic differentiation tools sets, discussing its potential for the future of fuzzy system design.

Via

Access Paper or Ask Questions

An Iterative Associative Memory Model for Empathetic Response Generation

Feb 28, 2024

Zhou Yang, Zhaochun Ren, Yufeng Wang, Chao Chen, Haizhou Sun, Xiaofei Zhu, Xiangwen Liao

Figure 1 for An Iterative Associative Memory Model for Empathetic Response Generation

Figure 2 for An Iterative Associative Memory Model for Empathetic Response Generation

Figure 3 for An Iterative Associative Memory Model for Empathetic Response Generation

Figure 4 for An Iterative Associative Memory Model for Empathetic Response Generation

Abstract:Empathetic response generation is to comprehend the cognitive and emotional states in dialogue utterances and generate proper responses. Psychological theories posit that comprehending emotional and cognitive states necessitates iteratively capturing and understanding associated words across dialogue utterances. However, existing approaches regard dialogue utterances as either a long sequence or independent utterances for comprehension, which are prone to overlook the associated words between them. To address this issue, we propose an Iterative Associative Memory Model (IAMM) for empathetic response generation. Specifically, we employ a novel second-order interaction attention mechanism to iteratively capture vital associated words between dialogue utterances and situations, dialogue history, and a memory module (for storing associated words), thereby accurately and nuancedly comprehending the utterances. We conduct experiments on the Empathetic-Dialogue dataset. Both automatic and human evaluations validate the efficacy of the model. Meanwhile, variant experiments on LLMs also demonstrate that attending to associated words improves empathetic comprehension and expression.

* 12 pages, 4 figures

Via

Access Paper or Ask Questions

Topological Analysis of Mouse Brain Vasculature via 3D Light-sheet Microscopy Images

Feb 23, 2024

Jiachen Yao, Nina Hagemann, Qiaojie Xiong, Jianxu Chen, Dirk M. Hermann, Chao Chen

Abstract:Vascular networks play a crucial role in understanding brain functionalities. Brain integrity and function, neuronal activity and plasticity, which are crucial for learning, are actively modulated by their local environments, specifically vascular networks. With recent developments in high-resolution 3D light-sheet microscopy imaging together with tissue processing techniques, it becomes feasible to obtain and examine large-scale brain vasculature in mice. To establish a structural foundation for functional study, however, we need advanced image analysis and structural modeling methods. Existing works use geometric features such as thickness, tortuosity, etc. However, geometric features cannot fully capture structural characteristics such as the richness of branches, connectivity, etc. In this paper, we study the morphology of brain vasculature through a topological lens. We extract topological features based on the theory of topological data analysis. Comparing of these robust and multi-scale topological structural features across different brain anatomical structures and between normal and obese populations sheds light on their promising future in studying neurological diseases.

Via

Access Paper or Ask Questions

Pheno-Robot: An Auto-Digital Modelling System for In-Situ Phenotyping in the Field

Feb 15, 2024

Yaoqiang Pan, Kewei Hu, Tianhao Liu, Chao Chen, Hanwen Kang

Figure 1 for Pheno-Robot: An Auto-Digital Modelling System for In-Situ Phenotyping in the Field

Figure 2 for Pheno-Robot: An Auto-Digital Modelling System for In-Situ Phenotyping in the Field

Figure 3 for Pheno-Robot: An Auto-Digital Modelling System for In-Situ Phenotyping in the Field

Figure 4 for Pheno-Robot: An Auto-Digital Modelling System for In-Situ Phenotyping in the Field

Abstract:Accurate reconstruction of plant models for phenotyping analysis is critical for optimising sustainable agricultural practices in precision agriculture. Traditional laboratory-based phenotyping, while valuable, falls short of understanding how plants grow under uncontrolled conditions. Robotic technologies offer a promising avenue for large-scale, direct phenotyping in real-world environments. This study explores the deployment of emerging robotics and digital technology in plant phenotyping to improve performance and efficiency. Three critical functional modules: environmental understanding, robotic motion planning, and in-situ phenotyping, are introduced to automate the entire process. Experimental results demonstrate the effectiveness of the system in agricultural environments. The pheno-robot system autonomously collects high-quality data by navigating around plants. In addition, the in-situ modelling model reconstructs high-quality plant models from the data collected by the robot. The developed robotic system shows high efficiency and robustness, demonstrating its potential to advance plant science in real-world agricultural environments.

Via

Access Paper or Ask Questions

FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting

Feb 08, 2024

Ziqing Ma, Wenwei Wang, Tian Zhou, Chao Chen, Bingqing Peng, Liang Sun, Rong Jin

Figure 1 for FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting

Figure 2 for FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting

Figure 3 for FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting

Figure 4 for FusionSF: Fuse Heterogeneous Modalities in a Vector Quantized Framework for Robust Solar Power Forecasting

Abstract:Accurate solar power forecasting is crucial to integrate photovoltaic plants into the electric grid, schedule and secure the power grid safety. This problem becomes more demanding for those newly installed solar plants which lack sufficient data. Current research predominantly relies on historical solar power data or numerical weather prediction in a single-modality format, ignoring the complementary information provided in different modalities. In this paper, we propose a multi-modality fusion framework to integrate historical power data, numerical weather prediction, and satellite images, significantly improving forecast performance. We introduce a vector quantized framework that aligns modalities with varying information densities, striking a balance between integrating sufficient information and averting model overfitting. Our framework demonstrates strong zero-shot forecasting capability, which is especially useful for those newly installed plants. Moreover, we collect and release a multi-modal solar power (MMSP) dataset from real-world plants to further promote the research of multi-modal solar forecasting algorithms. Our extensive experiments show that our model not only operates with robustness but also boosts accuracy in both zero-shot forecasting and scenarios rich with training data, surpassing leading models. We have incorporated it into our eForecaster platform and deployed it for more than 300 solar plants with a capacity of over 15GW.

Via

Access Paper or Ask Questions

Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting

Feb 08, 2024

Yanjun Zhao, Tian Zhou, Chao Chen, Liang Sun, Yi Qian, Rong Jin

Figure 1 for Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting

Figure 2 for Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting

Figure 3 for Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting

Figure 4 for Sparse-VQ Transformer: An FFN-Free Framework with Vector Quantization for Enhanced Time Series Forecasting

Abstract:Time series analysis is vital for numerous applications, and transformers have become increasingly prominent in this domain. Leading methods customize the transformer architecture from NLP and CV, utilizing a patching technique to convert continuous signals into segments. Yet, time series data are uniquely challenging due to significant distribution shifts and intrinsic noise levels. To address these two challenges,we introduce the Sparse Vector Quantized FFN-Free Transformer (Sparse-VQ). Our methodology capitalizes on a sparse vector quantization technique coupled with Reverse Instance Normalization (RevIN) to reduce noise impact and capture sufficient statistics for forecasting, serving as an alternative to the Feed-Forward layer (FFN) in the transformer architecture. Our FFN-free approach trims the parameter count, enhancing computational efficiency and reducing overfitting. Through evaluations across ten benchmark datasets, including the newly introduced CAISO dataset, Sparse-VQ surpasses leading models with a 7.84% and 4.17% decrease in MAE for univariate and multivariate time series forecasting, respectively. Moreover, it can be seamlessly integrated with existing transformer-based models to elevate their performance.

Via

Access Paper or Ask Questions

An invariance constrained deep learning network for PDE discovery

Feb 06, 2024

Chao Chen, Hui Li, Xiaowei Jin

Abstract:The discovery of partial differential equations (PDEs) from datasets has attracted increased attention. However, the discovery of governing equations from sparse data with high noise is still very challenging due to the difficulty of derivatives computation and the disturbance of noise. Moreover, the selection principles for the candidate library to meet physical laws need to be further studied. The invariance is one of the fundamental laws for governing equations. In this study, we propose an invariance constrained deep learning network (ICNet) for the discovery of PDEs. Considering that temporal and spatial translation invariance (Galilean invariance) is a fundamental property of physical laws, we filter the candidates that cannot meet the requirement of the Galilean transformations. Subsequently, we embedded the fixed and possible terms into the loss function of neural network, significantly countering the effect of sparse data with high noise. Then, by filtering out redundant terms without fixing learnable parameters during the training process, the governing equations discovered by the ICNet method can effectively approximate the real governing equations. We select the 2D Burgers equation, the equation of 2D channel flow over an obstacle, and the equation of 3D intracranial aneurysm as examples to verify the superiority of the ICNet for fluid mechanics. Furthermore, we extend similar invariance methods to the discovery of wave equation (Lorentz Invariance) and verify it through Single and Coupled Klein-Gordon equation. The results show that the ICNet method with physical constraints exhibits excellent performance in governing equations discovery from sparse and noisy data.

Via

Access Paper or Ask Questions

INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection

Feb 06, 2024

Chao Chen, Kai Liu, Ze Chen, Yi Gu, Yue Wu, Mingyuan Tao, Zhihang Fu, Jieping Ye

Abstract:Knowledge hallucination have raised widespread concerns for the security and reliability of deployed LLMs. Previous efforts in detecting hallucinations have been employed at logit-level uncertainty estimation or language-level self-consistency evaluation, where the semantic information is inevitably lost during the token-decoding procedure. Thus, we propose to explore the dense semantic information retained within LLMs' \textbf{IN}ternal \textbf{S}tates for halluc\textbf{I}nation \textbf{DE}tection (\textbf{INSIDE}). In particular, a simple yet effective \textbf{EigenScore} metric is proposed to better evaluate responses' self-consistency, which exploits the eigenvalues of responses' covariance matrix to measure the semantic consistency/diversity in the dense embedding space. Furthermore, from the perspective of self-consistent hallucination detection, a test time feature clipping approach is explored to truncate extreme activations in the internal states, which reduces overconfident generations and potentially benefits the detection of overconfident hallucinations. Extensive experiments and ablation studies are performed on several popular LLMs and question-answering (QA) benchmarks, showing the effectiveness of our proposal.

* Accepted by ICLR-2024

Via

Access Paper or Ask Questions

Optimal Parameter and Neuron Pruning for Out-of-Distribution Detection

Feb 04, 2024

Chao Chen, Zhihang Fu, Kai Liu, Ze Chen, Mingyuan Tao, Jieping Ye

Abstract:For a machine learning model deployed in real world scenarios, the ability of detecting out-of-distribution (OOD) samples is indispensable and challenging. Most existing OOD detection methods focused on exploring advanced training skills or training-free tricks to prevent the model from yielding overconfident confidence score for unknown samples. The training-based methods require expensive training cost and rely on OOD samples which are not always available, while most training-free methods can not efficiently utilize the prior information from the training data. In this work, we propose an \textbf{O}ptimal \textbf{P}arameter and \textbf{N}euron \textbf{P}runing (\textbf{OPNP}) approach, which aims to identify and remove those parameters and neurons that lead to over-fitting. The main method is divided into two steps. In the first step, we evaluate the sensitivity of the model parameters and neurons by averaging gradients over all training samples. In the second step, the parameters and neurons with exceptionally large or close to zero sensitivities are removed for prediction. Our proposal is training-free, compatible with other post-hoc methods, and exploring the information from all training data. Extensive experiments are performed on multiple OOD detection tasks and model architectures, showing that our proposed OPNP consistently outperforms the existing methods by a large margin.

* NeurIPS 2023
* Accepted by NeurIPS 2023. 19 pages

Via

Access Paper or Ask Questions

Sketch and Refine: Towards Fast and Accurate Lane Detection

Jan 26, 2024

Chao Chen, Jie Liu, Chang Zhou, Jie Tang, Gangshan Wu

Figure 1 for Sketch and Refine: Towards Fast and Accurate Lane Detection

Figure 2 for Sketch and Refine: Towards Fast and Accurate Lane Detection

Figure 3 for Sketch and Refine: Towards Fast and Accurate Lane Detection

Figure 4 for Sketch and Refine: Towards Fast and Accurate Lane Detection

Abstract:Lane detection is to determine the precise location and shape of lanes on the road. Despite efforts made by current methods, it remains a challenging task due to the complexity of real-world scenarios. Existing approaches, whether proposal-based or keypoint-based, suffer from depicting lanes effectively and efficiently. Proposal-based methods detect lanes by distinguishing and regressing a collection of proposals in a streamlined top-down way, yet lack sufficient flexibility in lane representation. Keypoint-based methods, on the other hand, construct lanes flexibly from local descriptors, which typically entail complicated post-processing. In this paper, we present a "Sketch-and-Refine" paradigm that utilizes the merits of both keypoint-based and proposal-based methods. The motivation is that local directions of lanes are semantically simple and clear. At the "Sketch" stage, local directions of keypoints can be easily estimated by fast convolutional layers. Then we can build a set of lane proposals accordingly with moderate accuracy. At the "Refine" stage, we further optimize these proposals via a novel Lane Segment Association Module (LSAM), which allows adaptive lane segment adjustment. Last but not least, we propose multi-level feature integration to enrich lane feature representations more efficiently. Based on the proposed "Sketch and Refine" paradigm, we propose a fast yet effective lane detector dubbed "SRLane". Experiments show that our SRLane can run at a fast speed (i.e., 278 FPS) while yielding an F1 score of 78.9\%. The source code is available at: https://github.com/passerer/SRLane.

Via

Access Paper or Ask Questions