Xiangyu Wang

School of Artificial Intelligence, Nanjing University

Causal Structure Learning Supervised by Large Language Model

Nov 20, 2023
Taiyu Ban, Lyuzhou Chen, Derui Lyu, Xiangyu Wang, Huanhuan Chen

Causal discovery from observational data is pivotal for deciphering complex relationships. Causal Structure Learning (CSL), which focuses on deriving causal Directed Acyclic Graphs (DAGs) from data, faces challenges due to vast DAG spaces and data sparsity. The integration of Large Language Models (LLMs), recognized for their causal reasoning capabilities, offers a promising direction to enhance CSL by infusing it with knowledge-based causal inferences. However, existing approaches utilizing LLMs for CSL have encountered issues, including unreliable constraints from imperfect LLM inferences and the computational intensity of full pairwise variable analyses. In response, we introduce the Iterative LLM Supervised CSL (ILS-CSL) framework. ILS-CSL integrates LLM-based causal inference with CSL in an iterative process, refining the causal DAG using feedback from LLMs. This method not only utilizes LLM resources more efficiently but also generates more robust and higher-quality structural constraints than previous methodologies. Our comprehensive evaluation across eight real-world datasets demonstrates ILS-CSL's superior performance, setting a new standard in CSL efficacy and showcasing its potential to significantly advance the field of causal discovery. The code is available at https://github.com/tyMadara/ILS-CSL.
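
A minimal sketch of the iterative supervision loop described above, with the CSL solver and the LLM query injected as callables; the constraint format, verdict labels, and function names are illustrative, not the authors' actual interface:

```python
from typing import Callable, List, Set, Tuple

Edge = Tuple[int, int]
Constraint = Tuple[str, int, int]  # ("forbid" | "require", src, dst)

def ils_csl(
    run_csl: Callable[[Set[Constraint]], List[Edge]],  # data-driven CSL solver
    ask_llm: Callable[[int, int], str],  # returns "ok" | "forbidden" | "reversed"
    max_iters: int = 5,
) -> List[Edge]:
    """Iterate: learn a DAG, have the LLM audit only its edges (not all
    variable pairs), fold the verdicts back in as constraints, relearn."""
    constraints: Set[Constraint] = set()
    dag = run_csl(constraints)                 # initial purely data-driven DAG
    for _ in range(max_iters):
        new: Set[Constraint] = set()
        for a, b in dag:                       # audit only edges in the DAG
            verdict = ask_llm(a, b)
            if verdict == "forbidden":         # LLM rejects a -> b outright
                new.add(("forbid", a, b))
            elif verdict == "reversed":        # LLM prefers b -> a
                new.add(("require", b, a))
        if new <= constraints:                 # no new feedback: converged
            break
        constraints |= new
        dag = run_csl(constraints)             # refine under LLM supervision
    return dag
```

Auditing only the edges of the current DAG, rather than all variable pairs, is what keeps the LLM budget far below a full pairwise analysis.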

Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields

Jul 27, 2023
Xiangyu Wang, Jingsen Zhu, Qi Ye, Yuchi Huo, Yunlong Ran, Zhihua Zhong, Jiming Chen

With the popularity of implicit neural representations, or neural radiance fields (NeRF), there is a pressing need for editing methods that can interact with implicit 3D models, for tasks like post-processing reconstructed scenes and 3D content creation. While previous works have explored NeRF editing from various perspectives, they are restricted in editing flexibility, quality, and speed, and fail to offer direct editing response and instant preview. The key challenge is to conceive a locally editable neural representation that can directly reflect the editing instructions and update instantly. To bridge the gap, we propose a new interactive editing method and system for implicit representations, called Seal-3D, which allows users to edit NeRF models at the pixel level in a free manner with a wide range of NeRF-like backbones, and to preview the editing effects instantly. To achieve these effects, the challenges are addressed by our proposed proxy function, which maps the editing instructions to the original space of NeRF models, and a teacher-student training strategy with local pretraining and global finetuning. A NeRF editing system is built to showcase various editing types. Our system can achieve compelling editing effects at an interactive speed of about 1 second.
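
A brief sketch of the teacher-student distillation at the heart of this kind of pipeline, assuming a frozen pre-edit teacher NeRF and a proxy function that maps edited-space query points back to the original space; the two-stage local-pretraining/global-finetuning schedule is omitted, and all names are illustrative rather than the authors' API:

```python
import torch
import torch.nn.functional as F

def proxy_map(pts, edit_fn):
    """Map query points in the edited space back to the original NeRF
    space; `edit_fn` encodes the user's editing instruction (e.g. a
    rigid transform applied to the sealed region)."""
    return edit_fn(pts)

def distill_step(student, teacher, pts, dirs, edit_fn, opt):
    """One training step: the student learns to render the edited scene
    by matching the frozen teacher's output at the pre-edit locations."""
    with torch.no_grad():                       # teacher stays frozen
        src = proxy_map(pts, edit_fn)           # look up pre-edit content
        rgb_t, sigma_t = teacher(src, dirs)
    rgb_s, sigma_s = student(pts, dirs)         # student renders the edit
    loss = F.mse_loss(rgb_s, rgb_t) + F.mse_loss(sigma_s, sigma_t)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Because the editing instruction is baked into the supervision targets rather than the network weights, the preview can be shown immediately while the student converges.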

* Accepted by ICCV2023. Project Page: https://windingwind.github.io/seal-3d/ Code: https://github.com/windingwind/seal-3d/ 

From Query Tools to Causal Architects: Harnessing Large Language Models for Advanced Causal Discovery from Data

Jun 29, 2023
Taiyu Ban, Lyuzhou Chen, Xiangyu Wang, Huanhuan Chen

Large Language Models (LLMs) exhibit exceptional abilities for causal analysis between concepts in numerous societally impactful domains, including medicine, science, and law. Recent research on LLM performance in various causal discovery and inference tasks has given rise to a new ladder in the classical three-stage framework of causality. In this paper, we advance the current research on LLM-driven causal discovery by proposing a novel framework that combines knowledge-based LLM causal analysis with data-driven causal structure learning. To make LLMs more than query tools and to leverage their power in discovering natural and new laws of causality, we integrate the valuable LLM expertise on existing causal mechanisms into statistical analysis of objective data to build a novel and practical baseline for causal structure learning. We introduce a universal set of prompts designed to extract causal graphs from given variables and assess the influence of LLM prior causality on recovering causal structures from data. We demonstrate that LLM expertise significantly enhances the quality of causal structures recovered from data, while also identifying critical challenges and issues, along with potential approaches to address them. As a pioneering study, this paper aims to emphasize the new frontier that LLMs are opening for classical causal discovery and inference, and to encourage the widespread adoption of LLM capabilities in data-driven causal analysis.
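
One plausible shape for a pairwise prompt in such a universal prompt set; the wording below is illustrative and not taken from the paper:

```python
def causal_prompt(var_a: str, var_b: str, domain: str) -> str:
    """Build one pairwise causal query in the spirit of a universal
    prompt set (phrasing here is a hypothetical example)."""
    return (
        f"You are an expert in {domain}.\n"
        f"Which cause-and-effect relationship is more likely?\n"
        f"A. {var_a} causes {var_b}\n"
        f"B. {var_b} causes {var_a}\n"
        f"C. No direct causal relationship between them\n"
        f"Answer with a single letter."
    )

def parse_answer(text: str) -> str | None:
    """Map the LLM's letter answer to an edge decision."""
    letter = text.strip().upper()[:1]
    return {"A": "a->b", "B": "b->a", "C": "none"}.get(letter)
```

Running such a prompt over variable pairs yields a knowledge-based causal graph that can then be imposed as a prior on the data-driven structure learner.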

Mitigating Prior Errors in Causal Structure Learning: Towards LLM driven Prior Knowledge

Jun 12, 2023
Lyuzhou Chen, Taiyu Ban, Xiangyu Wang, Derui Lyu, Huanhuan Chen

Causal structure learning is a prominent technique for encoding cause-and-effect relationships among variables through Bayesian Networks (BNs). Merely recovering causal structures from real-world observational data lacks precision, while the development of Large Language Models (LLMs) is opening a new frontier of causality. LLMs show a strong capability for discovering causal relationships between variables from the text that defines the investigated variables, suggesting a potential new hierarchy and new ladder of causality. We address a critical issue in the emerging topic of LLM-based causal structure learning: tackling erroneous prior causal statements from LLMs, which is seldom considered in the current context of expert-dominated prior resources. As a pioneering attempt, we propose a BN learning strategy resilient to prior errors without the need for human intervention. Focusing on edge-level priors, we classify the possible prior errors into three types: order-consistent, order-reversed, and irrelevant, and derive their theoretical impact on the Structural Hamming Distance (SHD) under the assumption of sufficient data. Intriguingly, we discover and prove that only the order-reversed error contributes to an increase in a unique acyclic closed structure, which we define as a "quasi-circle". Leveraging this insight, a post-hoc strategy is employed to identify order-reversed prior errors by their impact on the increment of "quasi-circles". Through empirical evaluation on both real and synthetic datasets, we demonstrate our strategy's robustness against prior errors. Specifically, we highlight its substantial ability to resist order-reversed errors while retaining the majority of correct prior knowledge.
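
Under one plausible reading of the abstract, a "quasi-circle" is a prior edge a -> b together with a second directed path from a to b: closed as an undirected loop, yet still acyclic. A small sketch of a post-hoc screen built on that reading (an interpretation for illustration, not the paper's algorithm):

```python
import networkx as nx

def quasi_circles(dag: nx.DiGraph, edge) -> int:
    """Count quasi-circles through a prior edge a -> b: acyclic closed
    structures formed by the edge plus another directed path a ~> b."""
    a, b = edge
    g = dag.copy()
    g.remove_edge(a, b)                       # look for alternative routes
    return sum(1 for _ in nx.all_simple_paths(g, a, b))

def flag_suspicious_priors(dag: nx.DiGraph, prior_edges):
    """Post-hoc screen: prior edges sitting on quasi-circles are
    candidates for order-reversed errors and can be re-examined."""
    return [e for e in prior_edges
            if dag.has_edge(*e) and quasi_circles(dag, e) > 0]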

* 14 pages, 4 figures 

GIMM: InfoMin-Max for Automated Graph Contrastive Learning

May 27, 2023
Xin Xiong, Furao Shen, Xiangyu Wang, Jian Zhao

Graph contrastive learning (GCL) shows great potential in unsupervised graph representation learning. Data augmentation plays a vital role in GCL, and its optimal choice heavily depends on the downstream task. Many GCL methods with automated data augmentation face the risk of insufficient information, as they fail to preserve the essential information necessary for the downstream task. To solve this problem, we propose InfoMin-Max for automated Graph contrastive learning (GIMM), which prevents GCL from encoding redundant information and losing essential information. GIMM consists of two major modules: (1) an automated graph view generator, which approximates InfoMin's optimal views through adversarial training without requiring task-relevant information; (2) view comparison, which learns an excellent encoder by applying InfoMax to view representations. To the best of our knowledge, GIMM is the first method that combines the InfoMin and InfoMax principles in GCL. In addition, GIMM introduces randomness into augmentation, thus stabilizing the model against perturbations. Extensive experiments on unsupervised and semi-supervised learning for node and graph classification demonstrate the superiority of our GIMM over state-of-the-art GCL methods with automated and manual data augmentation.
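
A compact sketch of the adversarial InfoMin-Max alternation: the encoder minimizes a contrastive loss between two generated views (InfoMax), while the view generator maximizes it, approximating views that share minimal redundant information (InfoMin). The training-step interface below is an assumption, not the paper's code:

```python
import torch

def gimm_step(generator, encoder, graph, opt_g, opt_e, nt_xent):
    """One alternation of the InfoMin-Max game. `nt_xent` is any
    contrastive loss over paired view embeddings."""
    # InfoMax step: encoder pulls the two generated views together.
    v1, v2 = generator(graph)
    loss_e = nt_xent(encoder(v1), encoder(v2))
    opt_e.zero_grad()
    loss_e.backward()
    opt_e.step()

    # InfoMin step (adversarial): generator pushes the views apart,
    # so only information the encoder truly needs survives in both.
    v1, v2 = generator(graph)
    loss_g = -nt_xent(encoder(v1), encoder(v2))
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
    return loss_e.item(), loss_g.item()
```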

Value of Exploration: Measurements, Findings and Algorithms

May 12, 2023
Yi Su, Xiangyu Wang, Elaine Ya Le, Liang Liu, Yuening Li, Haokai Lu, Benjamin Lipshitz, Sriraj Badam, Lukasz Heldt, Shuchao Bi, Ed Chi, Cristos Goodrow, Su-Lin Wu, Lexi Baugher, Minmin Chen

Effective exploration is believed to positively influence the long-term user experience on recommendation platforms. Determining its exact benefits, however, has been challenging. Regular A/B tests on exploration often measure neutral or even negative engagement metrics while failing to capture its long-term benefits. To address this, we present a systematic study to formally quantify the value of exploration by examining its effects on the content corpus, a key entity in the recommender system that directly affects user experiences. Specifically, we introduce new metrics and the associated experiment design to measure the benefit of exploration on corpus change, and further connect the corpus change to the long-term user experience. Furthermore, we investigate the possibility of introducing the Neural Linear Bandit algorithm to build an exploration-based ranking system, and use it as the backbone algorithm for our case study. We conduct extensive live experiments on a large-scale commercial recommendation platform that serves billions of users to validate the new experiment designs, quantify the long-term value of exploration, and verify the effectiveness of the adopted neural linear bandit algorithm for exploration.
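
For reference, a minimal sketch of the Bayesian linear head a neural linear bandit places on top of fixed neural features, with Thompson sampling providing the exploration; the prior, feature extractor, and serving logic are all assumptions here, not the paper's production system:

```python
import numpy as np

class NeuralLinearHead:
    """Bayesian linear regression over fixed neural features phi(x);
    sampling from the weight posterior drives exploration."""

    def __init__(self, dim: int, noise_var: float = 1.0, prior_var: float = 1.0):
        self.precision = np.eye(dim) / prior_var   # posterior precision
        self.b = np.zeros(dim)                     # sum of phi * reward
        self.noise_var = noise_var

    def sample_score(self, phi: np.ndarray, rng: np.random.Generator) -> float:
        """Thompson sampling: score an item with one posterior draw."""
        cov = np.linalg.inv(self.precision)
        mean = cov @ self.b / self.noise_var
        w = rng.multivariate_normal(mean, cov)
        return float(phi @ w)

    def update(self, phi: np.ndarray, reward: float) -> None:
        """Fold an observed reward into the posterior."""
        self.precision += np.outer(phi, phi) / self.noise_var
        self.b += phi * reward
```

At serving time, each candidate item gets one sampled score and the highest-scoring item is shown; observed rewards then update the posterior.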

* 19 pages 

A Graph Neural Network with Negative Message Passing for Graph Coloring

Jan 26, 2023
Xiangyu Wang, Xueming Yan, Yaochu Jin

Graph neural networks have received increased attention over the past years due to their promising ability to handle graph-structured data, which arises in many real-world problems such as recommender systems and drug synthesis. Most existing research focuses on using graph neural networks to solve homophilous problems, but little attention has been paid to heterophily-type problems. In this paper, we propose a graph neural network model for graph coloring, a class of representative heterophilous problems. Unlike conventional graph networks, we introduce negative message passing into the proposed graph neural network for more effective information exchange in handling graph coloring problems. Moreover, a new loss function that takes the self-information of the nodes into account is proposed to accelerate the learning process. Experimental studies are carried out to compare the proposed graph model with five state-of-the-art algorithms on ten publicly available graph coloring problems and one real-world application. Numerical results demonstrate the effectiveness of the proposed graph neural network.
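
A sketch of what a negative-message layer could look like: because adjacent nodes must take different colors, aggregated neighbor messages are subtracted rather than added, pushing connected embeddings apart. The layer below illustrates the idea, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class NegativeMessageLayer(nn.Module):
    """One heterophily-oriented layer: neighbor messages enter with a
    negative sign so connected nodes are driven toward different
    (color-coding) embeddings."""

    def __init__(self, dim: int):
        super().__init__()
        self.self_lin = nn.Linear(dim, dim)
        self.neigh_lin = nn.Linear(dim, dim)

    def forward(self, h: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # h: [N, D] node features; adj: dense [N, N] adjacency matrix.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh = adj @ h / deg                   # mean neighbor message
        return torch.relu(self.self_lin(h) - self.neigh_lin(neigh))
```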

ImmFusion: Robust mmWave-RGB Fusion for 3D Human Body Reconstruction in All Weather Conditions

Oct 04, 2022
Anjun Chen, Xiangyu Wang, Kun Shi, Shaohao Zhu, Yingfeng Chen, Bin Fang, Jiming Chen, Yuchi Huo, Qi Ye

3D human reconstruction from RGB images achieves decent results in good weather conditions but degrades dramatically in rough weather. As a complement, mmWave radars have been employed to reconstruct 3D human joints and meshes in rough weather. However, combining RGB and mmWave signals for robust all-weather 3D human reconstruction remains an open challenge, given the sparse nature of mmWave and the vulnerability of RGB images. In this paper, we present ImmFusion, the first mmWave-RGB fusion solution to robustly reconstruct 3D human bodies in all weather conditions. Specifically, ImmFusion consists of image and point backbones for token feature extraction and a Transformer module for token fusion. The image and point backbones refine global and local features from the original data, and the Fusion Transformer Module aims for effective information fusion of the two modalities by dynamically selecting informative tokens. Extensive experiments on a large-scale dataset, mmBody, captured in various environments demonstrate that ImmFusion can efficiently utilize the information of the two modalities to achieve robust 3D human body reconstruction in all weather conditions. In addition, our method's accuracy is significantly superior to that of state-of-the-art Transformer-based LiDAR-camera fusion methods.
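
A rough sketch of the token-fusion idea: image and point tokens are concatenated, scored for informativeness, pruned to the top-k, and fused with self-attention. The layer sizes and the gating rule below are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class FusionTransformer(nn.Module):
    """Fuse image and point tokens by dynamically keeping only the
    most informative tokens before self-attention (illustrative)."""

    def __init__(self, dim: int, heads: int = 4, keep: int = 128):
        super().__init__()
        self.score = nn.Linear(dim, 1)          # per-token informativeness
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.keep = keep

    def forward(self, img_tokens, pts_tokens):
        # img_tokens: [B, Ti, D], pts_tokens: [B, Tp, D]
        tokens = torch.cat([img_tokens, pts_tokens], dim=1)
        k = min(self.keep, tokens.shape[1])
        idx = self.score(tokens).squeeze(-1).topk(k, dim=1).indices
        picked = tokens.gather(
            1, idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1]))
        return self.encoder(picked)             # fused token features
```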

mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar

Sep 12, 2022
Anjun Chen, Xiangyu Wang, Shaohao Zhu, Yanxu Li, Jiming Chen, Qi Ye

Millimeter Wave (mmWave) radar is gaining popularity as it can work in adverse environments involving smoke, rain, snow, poor lighting, etc. Prior work has explored the possibility of reconstructing 3D skeletons or meshes from the noisy and sparse mmWave radar signals. However, it is unclear how accurately we can reconstruct the 3D body from mmWave signals across scenes and how the reconstruction compares with cameras, important aspects to consider when either using mmWave radars alone or combining them with cameras. To answer these questions, an automatic 3D body annotation system is first designed and built with multiple sensors to collect a large-scale dataset. The dataset consists of synchronized and calibrated mmWave radar point clouds and RGB(D) images in different scenes, together with skeleton/mesh annotations for the humans in the scenes. With this dataset, we train state-of-the-art methods with inputs from different sensors and test them in various scenarios. The results demonstrate that 1) despite the noise and sparsity of the generated point clouds, the mmWave radar can achieve better reconstruction accuracy than the RGB camera but worse than the depth camera; 2) reconstruction from the mmWave radar is moderately affected by adverse weather conditions, while the RGB(D) camera is severely affected. Further analysis of the dataset and the results sheds light on improving reconstruction from the mmWave radar and on combining signals from different sensors.

* Multimedia 2022, 10 pages, 11 figures 

SE-Harris and eSUSAN: Asynchronous Event-Based Corner Detection Using Megapixel Resolution CeleX-V Camera

May 02, 2021
Jinjian Li, Chuandong Guo, Li Su, Xiangyu Wang, Quan Hu

Event cameras are novel neuromorphic vision sensors with ultrahigh temporal resolution and low latency, both on the order of microseconds. Instead of image frames, event cameras generate an asynchronous event stream of per-pixel intensity changes with precise timestamps. The resulting sparse data structure impedes applying many conventional computer vision techniques to event streams, so specific algorithms should be designed to leverage the information provided by event cameras. We propose a corner detection algorithm, eSUSAN, inspired by the conventional SUSAN (smallest univalue segment assimilating nucleus) algorithm for corner detection. The proposed eSUSAN extracts the univalue segment assimilating nucleus from the circle kernel based on the similarity across timestamps and distinguishes corner events by the number of pixels in the nucleus area. Moreover, eSUSAN is fast enough to be applied to CeleX-V, the event camera with the highest resolution available. Based on eSUSAN, we also propose the SE-Harris corner detector, which uses adaptive normalization based on exponential decay to quickly construct a local surface of active events, and the event-based Harris detector to refine the corners identified by eSUSAN. We evaluated the proposed algorithms on a public dataset and CeleX-V data. Both eSUSAN and SE-Harris exhibit higher real-time performance than existing algorithms while maintaining high accuracy and tracking performance.
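
A minimal sketch of an eSUSAN-style test on a surface of active events (a per-pixel map of latest timestamps): pixels in a circle kernel whose timestamps are close to the incoming event form the USAN, and a small USAN indicates a corner. The kernel radius, similarity window, and threshold below are illustrative guesses, not the paper's parameters:

```python
import numpy as np

# Circle-kernel offsets around the nucleus (a radius-3 ring, as in
# classic SUSAN; the exact kernel here is an assumption).
RING = [(dx, dy) for dx in range(-3, 4) for dy in range(-3, 4)
        if 2.5 <= (dx * dx + dy * dy) ** 0.5 <= 3.5]

def esusan_is_corner(sae: np.ndarray, x: int, y: int, t: float,
                     sim_dt: float = 5e-3, frac: float = 0.375) -> bool:
    """Classify one incoming event (x, y, t) against the surface of
    active events `sae` (per-pixel latest timestamps, shape [H, W])."""
    h, w = sae.shape
    usan = 0
    for dx, dy in RING:
        px, py = x + dx, y + dy
        # Kernel pixels with recent, similar timestamps join the USAN.
        if 0 <= px < w and 0 <= py < h and abs(t - sae[py, px]) < sim_dt:
            usan += 1
    sae[y, x] = t                       # update the surface with the event
    return usan <= frac * len(RING)     # small USAN area -> corner event
```

SE-Harris then reweights such a local surface with exponential decay and runs an event-based Harris response to refine the candidates this test produces.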

* 10 pages, 8 figures 