Jian Liu

Planning Reliability Assurance Tests for Autonomous Vehicles

Nov 30, 2023
Simin Zheng, Lu Lu, Yili Hong, Jian Liu

Artificial intelligence (AI) technology has become increasingly prevalent and is transforming everyday life. One important application of AI technology is the development of autonomous vehicles (AVs). However, the reliability of an AV needs to be carefully demonstrated via an assurance test so that the product can be used with confidence in the field. To plan an assurance test, one needs to determine how many AVs need to be tested, for how many miles, and the standard for passing the test. Existing research has made great efforts in developing reliability demonstration tests for product development and assessment in other application fields. However, statistical methods have not been utilized in AV test planning. This paper aims to fill this gap by developing statistical methods for planning AV reliability assurance tests based on recurrent events data. We explore the relationship between multiple criteria of interest in the context of planning AV reliability assurance tests. Specifically, we develop two test planning strategies based on homogeneous and non-homogeneous Poisson processes while balancing multiple objectives with the Pareto front approach. We also offer recommendations for practical use. The disengagement events data from the California Department of Motor Vehicles AV testing program are used to illustrate the proposed assurance test planning methods.
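To make the planning question concrete, here is a minimal sketch of a test plan under a homogeneous Poisson process. It is not the paper's method: it assumes a constant event rate per mile, a pass rule of "at most `max_events` events over the whole fleet", and a single consumer-risk criterion. All function and parameter names are hypothetical.

```python
import math


def poisson_cdf(c, mean):
    """Return P(N <= c) for N ~ Poisson(mean)."""
    term = math.exp(-mean)  # P(N = 0)
    total = term
    for k in range(1, c + 1):
        term *= mean / k  # P(N = k) from P(N = k - 1)
        total += term
    return total


def pass_probability(rate_per_mile, n_vehicles, miles_per_vehicle, max_events):
    """Probability that the fleet passes, assuming a homogeneous Poisson process.

    Assumption: events across vehicles pool into one Poisson count whose mean
    is rate_per_mile times the total miles driven.
    """
    exposure = n_vehicles * miles_per_vehicle
    return poisson_cdf(max_events, rate_per_mile * exposure)


def min_total_miles(unacceptable_rate, max_events, consumer_risk, step=100.0):
    """Smallest total mileage (in multiples of `step`) such that a fleet with
    the unacceptable event rate passes with probability <= consumer_risk."""
    miles = step
    while poisson_cdf(max_events, unacceptable_rate * miles) > consumer_risk:
        miles += step
    return miles


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    print(pass_probability(1e-3, n_vehicles=10, miles_per_vehicle=100, max_events=0))
    print(min_total_miles(1e-3, max_events=0, consumer_risk=0.1))
```

Allowing more events in the pass rule raises the mileage needed to keep the same consumer risk, which is one example of the trade-off between criteria that the abstract refers to.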

* 29 pages, 5 figures 

A Quality-based Syntactic Template Retriever for Syntactically-controlled Paraphrase Generation

Oct 20, 2023
Xue Zhang, Songming Zhang, Yunlong Liang, Yufeng Chen, Jian Liu, Wenjuan Han, Jinan Xu

Existing syntactically-controlled paraphrase generation (SPG) models perform promisingly with human-annotated or well-chosen syntactic templates. However, the difficulty of obtaining such templates actually hinders the practical application of SPG models. For one thing, the prohibitive cost makes it unfeasible to manually design decent templates for every source sentence. For another, the templates automatically retrieved by current heuristic methods are usually unreliable for SPG models to generate qualified paraphrases. To escape this dilemma, we propose a novel Quality-based Syntactic Template Retriever (QSTR) to retrieve templates based on the quality of the to-be-generated paraphrases. Furthermore, for situations requiring multiple paraphrases for each source sentence, we design a Diverse Templates Search (DTS) algorithm, which can enhance the diversity between paraphrases without sacrificing quality. Experiments demonstrate that QSTR can significantly surpass existing retrieval methods in generating high-quality paraphrases and even perform comparably with human-annotated templates in terms of reference-free metrics. Additionally, human evaluation and the performance on downstream tasks using our generated paraphrases for data augmentation showcase the potential of our QSTR and DTS algorithm in practical scenarios.

* Accepted to EMNLP 2023 

ChatGPT for Computational Topology

Oct 19, 2023
Jian Liu, Li Shen, Guo-Wei Wei

ChatGPT represents a significant milestone in the field of artificial intelligence (AI), finding widespread applications across diverse domains. However, its effectiveness in mathematical contexts has been somewhat constrained by its susceptibility to conceptual errors. Concurrently, topological data analysis (TDA), a relatively new discipline, has garnered substantial interest in recent years. Nonetheless, the advancement of TDA is impeded by the limited understanding of computational algorithms and coding proficiency among theoreticians. This work endeavors to bridge the gap between theoretical topological concepts and their practical implementation in computational topology through the utilization of ChatGPT. We showcase how a pure theoretician, devoid of computational experience and coding skills, can effectively transform mathematical formulations and concepts into functional code for computational topology with the assistance of ChatGPT. Our strategy outlines a productive process wherein a mathematician trains ChatGPT on pure mathematical concepts, steers ChatGPT towards generating computational topology code, and subsequently validates the generated code using established examples. Our specific case studies encompass the computation of Betti numbers, Laplacian matrices, and Dirac matrices for simplicial complexes, as well as the persistence of various homologies and Laplacians. Furthermore, we explore the application of ChatGPT in computing recently developed topological theories for hypergraphs and digraphs. This work serves as an initial step towards effectively transforming pure mathematical theories into practical computational tools, with the ultimate goal of enabling real applications across diverse fields.
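As an illustration of the kind of computation mentioned above, the following is a minimal sketch of Betti numbers for a small simplicial complex, computed from the ranks of boundary matrices over the rationals. It is not code from the paper. The input layout (`simplices_by_dim`, a list of sorted vertex tuples per dimension, closed under taking faces) and all function names are assumptions made for this example.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix over the rationals via Gaussian elimination."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    r = 0
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            if factor != 0:
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def boundary_matrix(k_simplices, faces):
    """Matrix of the boundary map from k-simplices to (k-1)-simplices."""
    index = {s: i for i, s in enumerate(faces)}
    mat = [[0] * len(k_simplices) for _ in faces]
    for j, simplex in enumerate(k_simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]  # drop the i-th vertex
            mat[index[face]][j] = (-1) ** i
    return mat


def betti_numbers(simplices_by_dim):
    """Betti number b_k = (#k-simplices) - rank(d_k) - rank(d_{k+1})."""
    top = len(simplices_by_dim)
    ranks = [0]  # rank of the boundary map on vertices is zero
    for k in range(1, top):
        if simplices_by_dim[k]:
            ranks.append(rank(boundary_matrix(simplices_by_dim[k],
                                              simplices_by_dim[k - 1])))
        else:
            ranks.append(0)
    ranks.append(0)  # no simplices above the top dimension
    return [len(simplices_by_dim[k]) - ranks[k] - ranks[k + 1] for k in range(top)]


if __name__ == "__main__":
    hollow_triangle = [[(0,), (1,), (2,)], [(0, 1), (0, 2), (1, 2)]]
    print(betti_numbers(hollow_triangle))
```

The hollow triangle is the kind of established example that generated code can be validated against: it has one connected component and one loop.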

Label-efficient Segmentation via Affinity Propagation

Oct 17, 2023
Wentong Li, Yuqian Yuan, Song Wang, Wenyu Liu, Dongqi Tang, Jian Liu, Jianke Zhu, Lei Zhang

Weakly-supervised segmentation with label-efficient sparse annotations has attracted increasing research attention as a way to reduce the cost of the laborious pixel-wise labeling process, and pairwise affinity modeling techniques play an essential role in this task. Most existing approaches focus on using a local appearance kernel to model the neighboring pairwise potentials. However, such a local operation fails to capture long-range dependencies and ignores the topology of objects. In this work, we formulate affinity modeling as an affinity propagation process, and propose local and global pairwise affinity terms to generate accurate soft pseudo labels. An efficient algorithm is also developed to significantly reduce the computational cost. The proposed approach can be conveniently plugged into existing segmentation networks. Experiments on three typical label-efficient segmentation tasks, i.e. box-supervised instance segmentation, point/scribble-supervised semantic segmentation and CLIP-guided semantic segmentation, demonstrate the superior performance of the proposed approach.

* Accepted to NeurIPS 2023. Project page: https://LiWentomng.github.io/apro/. Code: https://github.com/CircleRadon/APro 

Equirectangular image construction method for standard CNNs for Semantic Segmentation

Oct 13, 2023
Haoqian Chen, Jian Liu, Minghe Li, Kaiwen Jiang, Ziheng Xu, Rencheng Sun, Yi Sui

360° spherical images offer a wide field of view and are typically projected onto a plane for processing, which yields an equirectangular image. Object shapes in equirectangular images can be distorted and lack translation invariance. In addition, there are few publicly available labeled datasets of equirectangular images, which makes it challenging for standard CNN models to process equirectangular images effectively. To tackle this problem, we propose a methodology for converting a perspective image into an equirectangular image. The inverse transformation of the spherical center projection and the equidistant cylindrical projection are employed. This enables standard CNNs to learn the distortion features at different positions in the equirectangular image and thereby gain the ability to semantically segment the equirectangular image. The parameter φ, which determines the projection position of the perspective image, has been analyzed using various datasets and models, such as UNet, UNet++, SegNet, PSPNet, and DeepLab v3+. The experiments demonstrate that an optimal value of φ for effective semantic segmentation of equirectangular images is 6π/16 for standard CNNs. Compared with the other three types of methods (supervised learning, unsupervised learning and data augmentation), the method proposed in this paper has the best average IoU value of 43.76%. This value is 23.85%, 10.7% and 17.23% higher than those of the other three methods, respectively.
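The two projections named above can be sketched as a point mapping: the inverse of the spherical center (gnomonic) projection takes a perspective image-plane point to latitude and longitude, and the equidistant cylindrical projection takes those angles to equirectangular pixel coordinates. This is a sketch under stated assumptions, not the paper's implementation. It assumes that φ is the latitude of the projection center, that `x` and `y` are image-plane offsets divided by the focal length with `y` pointing up, and that the output image is 2048 by 1024. The function name and parameters are hypothetical.

```python
import math


def perspective_to_equirect(x, y, phi0, lon0=0.0, width=2048, height=1024):
    """Map a normalized perspective image point to equirectangular pixels.

    x, y  : image-plane offsets from the principal point, divided by the
            focal length (assumed convention, y pointing up).
    phi0  : latitude of the projection center in radians (assumed meaning
            of the parameter phi).
    lon0  : longitude of the projection center in radians.
    """
    rho = math.hypot(x, y)
    if rho == 0.0:
        lat, lon = phi0, lon0
    else:
        # Inverse gnomonic projection (standard map-projection formulas).
        c = math.atan(rho)
        s = (math.cos(c) * math.sin(phi0)
             + y * math.sin(c) * math.cos(phi0) / rho)
        lat = math.asin(max(-1.0, min(1.0, s)))  # clamp against rounding
        lon = lon0 + math.atan2(
            x * math.sin(c),
            rho * math.cos(phi0) * math.cos(c) - y * math.sin(phi0) * math.sin(c),
        )
    # Equidistant cylindrical projection: angles scale linearly to pixels.
    u = ((lon / (2.0 * math.pi) + 0.5) * width) % width
    v = (0.5 - lat / math.pi) * height
    return u, v


if __name__ == "__main__":
    # Center of a perspective image placed at phi = 6*pi/16.
    print(perspective_to_equirect(0.0, 0.0, phi0=6 * math.pi / 16))
```

A full image conversion would typically run the mapping in the opposite direction (from each output pixel back to the source image) and interpolate; the forward mapping here only shows where a given source point lands.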

GLoRE: Evaluating Logical Reasoning of Large Language Models

Oct 13, 2023
Hanmeng Liu, Zhiyang Teng, Ruoxi Ning, Jian Liu, Qiji Zhou, Yue Zhang

Recently, large language models (LLMs), including notable models such as GPT-4 and burgeoning community models, have showcased significant general language understanding abilities. However, there have been few attempts to assess the logical reasoning capacities of these LLMs, an essential facet of natural language understanding. To encourage further investigation in this area, we introduce GLoRE, a meticulously assembled General Logical Reasoning Evaluation benchmark comprising 12 datasets that span three different types of tasks. Our experimental results show that, compared with human performance and supervised fine-tuning, the logical reasoning capabilities of open LLMs need further improvement; ChatGPT and GPT-4 show a strong capability for logical reasoning, with GPT-4 surpassing ChatGPT by a large margin. We propose a self-consistency probing method to enhance the accuracy of ChatGPT and a fine-tuning method to boost the performance of an open LLM. We release the datasets and evaluation programs to facilitate future research.

AffordPose: A Large-scale Dataset of Hand-Object Interactions with Affordance-driven Hand Pose

Sep 16, 2023
Juntao Jian, Xiuping Liu, Manyi Li, Ruizhen Hu, Jian Liu

How humans interact with objects depends on the functional roles of the target objects, which introduces the problem of affordance-aware hand-object interaction. Learning and understanding plausible and appropriate hand-object interactions requires a large number of human demonstrations. In this work, we present AffordPose, a large-scale dataset of hand-object interactions with affordance-driven hand pose. We first annotate the specific part-level affordance labels for each object, e.g. twist, pull, handle-grasp, etc., instead of general intents such as use or handover, to indicate the purpose and guide the localization of the hand-object interactions. The fine-grained hand-object interactions reveal the influence of hand-centered affordances on the detailed arrangement of the hand poses, yet also exhibit a certain degree of diversity. We collect a total of 26.7K hand-object interactions, each including the 3D object shape, the part-level affordance label, and the manually adjusted hand poses. The comprehensive data analysis shows the common characteristics and diversity of hand-object interactions per affordance via the parameter statistics and contacting computation. We also conduct experiments on the tasks of hand-object affordance understanding and affordance-oriented hand-object interaction generation, to validate the effectiveness of our dataset in learning the fine-grained hand-object interactions. Project page: https://github.com/GentlesJan/AffordPose.

* Accepted by ICCV 2023 

Expressive paragraph text-to-speech synthesis with multi-step variational autoencoder

Sep 02, 2023
Xuyuan Li, Zengqiang Shang, Jian Liu, Hua Hua, Peiyang Shi, Pengyuan Zhang

Neural networks have been able to generate high-quality single-sentence speech with substantial expressiveness. However, paragraph-level speech synthesis remains a challenge due to the need for coherent acoustic features while delivering fluctuating speech styles. Meanwhile, training these models directly on over-length speech leads to a deterioration in the quality of the synthesized speech. To address these problems, we propose a high-quality and expressive paragraph speech synthesis system with a multi-step variational autoencoder. Specifically, we employ multi-step latent variables to capture speech information at different grammatical levels before utilizing these features in parallel to generate the speech waveform. We also propose a three-step training method to improve the decoupling ability. Our model was trained on a single-speaker French audiobook corpus released at Blizzard Challenge 2023. Experimental results underscore the significant superiority of our system over baseline models.

* 5 pages, 1 figure, 2 tables 