Alert button
Picture for Jing Xu

Jing Xu

Alert button

Chain-of-Verification Reduces Hallucination in Large Language Models

Sep 20, 2023
Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston

Generation of plausible yet incorrect factual information, termed hallucination, is an unsolved issue in large language models. We study the ability of language models to deliberate on the responses they give in order to correct their mistakes. We develop the Chain-of-Verification (CoVe) method whereby the model first (i) drafts an initial response; then (ii) plans verification questions to fact-check its draft; (iii) answers those questions independently so the answers are not biased by other responses; and (iv) generates its final verified response. In experiments, we show CoVe decreases hallucinations across a variety of tasks, from list-based questions from Wikidata, closed book MultiSpanQA and longform text generation.

Viaarxiv icon

TransTouch: Learning Transparent Objects Depth Sensing Through Sparse Touches

Sep 18, 2023
Liuyu Bian, Pengyang Shi, Weihang Chen, Jing Xu, Li Yi, Rui Chen

Transparent objects are common in daily life. However, depth sensing for transparent objects remains a challenging problem. While learning-based methods can leverage shape priors to improve the sensing quality, the labor-intensive data collection in the real world and the sim-to-real domain gap restrict these methods' scalability. In this paper, we propose a method to finetune a stereo network with sparse depth labels automatically collected using a probing system with tactile feedback. We present a novel utility function to evaluate the benefit of touches. By approximating and optimizing the utility function, we can optimize the probing locations given a fixed touching budget to better improve the network's performance on real objects. We further combine tactile depth supervision with a confidence-based regularization to prevent over-fitting during finetuning. To evaluate the effectiveness of our method, we construct a real-world dataset including both diffuse and transparent objects. Experimental results on this dataset show that our method can significantly improve real-world depth sensing accuracy, especially for transparent objects.

* Accepted to the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 
Viaarxiv icon

BlindSage: Label Inference Attacks against Node-level Vertical Federated Graph Neural Networks

Aug 04, 2023
Marco Arazzi, Mauro Conti, Stefanos Koffas, Marina Krcek, Antonino Nocera, Stjepan Picek, Jing Xu

Federated learning enables collaborative training of machine learning models by keeping the raw data of the involved workers private. One of its main objectives is to improve the models' privacy, security, and scalability. Vertical Federated Learning (VFL) offers an efficient cross-silo setting where a few parties collaboratively train a model without sharing the same features. In such a scenario, classification labels are commonly considered sensitive information held exclusively by one (active) party, while other (passive) parties use only their local information. Recent works have uncovered important flaws of VFL, leading to possible label inference attacks under the assumption that the attacker has some, even limited, background knowledge on the relation between labels and data. In this work, we are the first (to the best of our knowledge) to investigate label inference attacks on VFL using a zero-background knowledge strategy. To concretely formulate our proposal, we focus on Graph Neural Networks (GNNs) as a target model for the underlying VFL. In particular, we refer to node classification tasks, which are widely studied, and GNNs have shown promising results. Our proposed attack, BlindSage, provides impressive results in the experiments, achieving nearly 100% accuracy in most cases. Even when the attacker has no information about the used architecture or the number of classes, the accuracy remained above 85% in most instances. Finally, we observe that well-known defenses cannot mitigate our attack without affecting the model's performance on the main classification task.

Viaarxiv icon

A Novel Multi-Task Model Imitating Dermatologists for Accurate Differential Diagnosis of Skin Diseases in Clinical Images

Jul 17, 2023
Yan-Jie Zhou, Wei Liu, Yuan Gao, Jing Xu, Le Lu, Yuping Duan, Hao Cheng, Na Jin, Xiaoyong Man, Shuang Zhao, Yu Wang

Figure 1 for A Novel Multi-Task Model Imitating Dermatologists for Accurate Differential Diagnosis of Skin Diseases in Clinical Images
Figure 2 for A Novel Multi-Task Model Imitating Dermatologists for Accurate Differential Diagnosis of Skin Diseases in Clinical Images
Figure 3 for A Novel Multi-Task Model Imitating Dermatologists for Accurate Differential Diagnosis of Skin Diseases in Clinical Images
Figure 4 for A Novel Multi-Task Model Imitating Dermatologists for Accurate Differential Diagnosis of Skin Diseases in Clinical Images

Skin diseases are among the most prevalent health issues, and accurate computer-aided diagnosis methods are of importance for both dermatologists and patients. However, most of the existing methods overlook the essential domain knowledge required for skin disease diagnosis. A novel multi-task model, namely DermImitFormer, is proposed to fill this gap by imitating dermatologists' diagnostic procedures and strategies. Through multi-task learning, the model simultaneously predicts body parts and lesion attributes in addition to the disease itself, enhancing diagnosis accuracy and improving diagnosis interpretability. The designed lesion selection module mimics dermatologists' zoom-in action, effectively highlighting the local lesion features from noisy backgrounds. Additionally, the presented cross-interaction module explicitly models the complicated diagnostic reasoning between body parts, lesion attributes, and diseases. To provide a more robust evaluation of the proposed method, a large-scale clinical image dataset of skin diseases with significantly more cases than existing datasets has been established. Extensive experiments on three different datasets consistently demonstrate the state-of-the-art recognition performance of the proposed approach.

* MICCAI 2023 early accept 
Viaarxiv icon

Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts

Jul 06, 2023
Mounica Maddela, Megan Ung, Jing Xu, Andrea Madotto, Heather Foran, Y-Lan Boureau

Figure 1 for Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts
Figure 2 for Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts
Figure 3 for Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts
Figure 4 for Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts

Many cognitive approaches to well-being, such as recognizing and reframing unhelpful thoughts, have received considerable empirical support over the past decades, yet still lack truly widespread adoption in self-help format. A barrier to that adoption is a lack of adequately specific and diverse dedicated practice material. This work examines whether current language models can be leveraged to both produce a virtually unlimited quantity of practice material illustrating standard unhelpful thought patterns matching specific given contexts, and generate suitable positive reframing proposals. We propose PATTERNREFRAME, a novel dataset of about 10k examples of thoughts containing unhelpful thought patterns conditioned on a given persona, accompanied by about 27k positive reframes. By using this dataset to train and/or evaluate current models, we show that existing models can already be powerful tools to help generate an abundance of tailored practice material and hypotheses, with no or minimal additional model training required.

* ACL 2023 
Viaarxiv icon

HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection

Jul 03, 2023
Kang Yi, Jing Xu, Xiao Jin, Fu Guo, Yan-Feng Wu

Figure 1 for HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection
Figure 2 for HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection
Figure 3 for HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection
Figure 4 for HODINet: High-Order Discrepant Interaction Network for RGB-D Salient Object Detection

RGB-D salient object detection (SOD) aims to detect the prominent regions by jointly modeling RGB and depth information. Most RGB-D SOD methods apply the same type of backbones and fusion modules to identically learn the multimodality and multistage features. However, these features contribute differently to the final saliency results, which raises two issues: 1) how to model discrepant characteristics of RGB images and depth maps; 2) how to fuse these cross-modality features in different stages. In this paper, we propose a high-order discrepant interaction network (HODINet) for RGB-D SOD. Concretely, we first employ transformer-based and CNN-based architectures as backbones to encode RGB and depth features, respectively. Then, the high-order representations are delicately extracted and embedded into spatial and channel attentions for cross-modality feature fusion in different stages. Specifically, we design a high-order spatial fusion (HOSF) module and a high-order channel fusion (HOCF) module to fuse features of the first two and the last two stages, respectively. Besides, a cascaded pyramid reconstruction network is adopted to progressively decode the fused features in a top-down pathway. Extensive experiments are conducted on seven widely used datasets to demonstrate the effectiveness of the proposed approach. We achieve competitive performance against 24 state-of-the-art methods under four evaluation metrics.

Viaarxiv icon

Searching for the Fakes: Efficient Neural Architecture Search for General Face Forgery Detection

Jun 16, 2023
Xiao Jin, Xin-Yue Mu, Jing Xu

Figure 1 for Searching for the Fakes: Efficient Neural Architecture Search for General Face Forgery Detection
Figure 2 for Searching for the Fakes: Efficient Neural Architecture Search for General Face Forgery Detection
Figure 3 for Searching for the Fakes: Efficient Neural Architecture Search for General Face Forgery Detection
Figure 4 for Searching for the Fakes: Efficient Neural Architecture Search for General Face Forgery Detection

As the saying goes, "seeing is believing". However, with the development of digital face editing tools, we can no longer trust what we can see. Although face forgery detection has made promising progress, most current methods are designed manually by human experts, which is labor-consuming. In this paper, we develop an end-to-end framework based on neural architecture search (NAS) for deepfake detection, which can automatically design network architectures without human intervention. First, a forgery-oriented search space is created to choose appropriate operations for this task. Second, we propose a novel performance estimation metric, which guides the search process to select more general models. The cross-dataset search is also considered to develop more general architectures. Eventually, we connect the cells in a cascaded pyramid way for final forgery classification. Compared with state-of-the-art networks artificially designed, our method achieves competitive performance in both in-dataset and cross-dataset scenarios.

Viaarxiv icon

Improving Open Language Models by Learning from Organic Interactions

Jun 07, 2023
Jing Xu, Da Ju, Joshua Lane, Mojtaba Komeili, Eric Michael Smith, Megan Ung, Morteza Behrooz, William Ngan, Rashel Moritz, Sainbayar Sukhbaatar, Y-Lan Boureau, Jason Weston, Kurt Shuster

Figure 1 for Improving Open Language Models by Learning from Organic Interactions
Figure 2 for Improving Open Language Models by Learning from Organic Interactions
Figure 3 for Improving Open Language Models by Learning from Organic Interactions
Figure 4 for Improving Open Language Models by Learning from Organic Interactions

We present BlenderBot 3x, an update on the conversational model BlenderBot 3, which is now trained using organic conversation and feedback data from participating users of the system in order to improve both its skills and safety. We are publicly releasing the participating de-identified interaction data for use by the research community, in order to spur further progress. Training models with organic data is challenging because interactions with people "in the wild" include both high quality conversations and feedback, as well as adversarial and toxic behavior. We study techniques that enable learning from helpful teachers while avoiding learning from people who are trying to trick the model into unhelpful or toxic responses. BlenderBot 3x is both preferred in conversation to BlenderBot 3, and is shown to produce safer responses in challenging situations. While our current models are still far from perfect, we believe further improvement can be achieved by continued use of the techniques explored in this work.

Viaarxiv icon

Quantifying the Variability Collapse of Neural Networks

Jun 06, 2023
Jing Xu, Haoxiong Liu

Figure 1 for Quantifying the Variability Collapse of Neural Networks
Figure 2 for Quantifying the Variability Collapse of Neural Networks
Figure 3 for Quantifying the Variability Collapse of Neural Networks
Figure 4 for Quantifying the Variability Collapse of Neural Networks

Recent studies empirically demonstrate the positive relationship between the transferability of neural networks and the within-class variation of the last layer features. The recently discovered Neural Collapse (NC) phenomenon provides a new perspective of understanding such last layer geometry of neural networks. In this paper, we propose a novel metric, named Variability Collapse Index (VCI), to quantify the variability collapse phenomenon in the NC paradigm. The VCI metric is well-motivated and intrinsically related to the linear probing loss on the last layer features. Moreover, it enjoys desired theoretical and empirical properties, including invariance under invertible linear transformations and numerical stability, that distinguishes it from previous metrics. Our experiments verify that VCI is indicative of the variability collapse and the transferability of pretrained neural networks.

Viaarxiv icon