Wei Tang

LaplaceConfidence: a Graph-based Approach for Learning with Noisy Labels

Jul 31, 2023
Mingcai Chen, Yuntao Du, Wei Tang, Baoming Zhang, Hao Cheng, Shuwei Qian, Chongjun Wang

In real-world applications, perfect labels are rarely available, making it challenging to develop robust machine learning algorithms that can handle noisy labels. Recent methods have focused on filtering noise based on the discrepancy between model predictions and given noisy labels, assuming that samples with small classification losses are clean. This work takes a different approach by leveraging the consistency between the learned model and the entire noisy dataset using the rich representational and topological information in the data. We introduce LaplaceConfidence, a method that obtains label confidence (i.e., clean probabilities) using the Laplacian energy. Specifically, it first constructs graphs based on the feature representations of all noisy samples and minimizes the Laplacian energy to produce a low-energy graph. Clean labels should fit well into the low-energy graph while noisy ones should not, allowing our method to determine the clean probability of each sample. Furthermore, LaplaceConfidence is embedded into a holistic method for robust training, where a co-training technique generates unbiased label confidence and a label refurbishment technique makes better use of it. We also explore a dimensionality reduction technique to scale our method to large noisy datasets. Our experiments demonstrate that LaplaceConfidence outperforms state-of-the-art methods on benchmark datasets under both synthetic and real-world noise.
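
As a rough illustration of why a noisy label fits poorly into a graph built from features, the sketch below builds a k-nearest-neighbour graph over toy 2D features and measures each sample's share of the Laplacian energy of the given labels. The function names, the unweighted kNN graph, and the value of k are assumptions made for illustration, and the sketch only measures the energy; the energy-minimization step described in the abstract is not reproduced.

```python
import math

def knn_graph(features, k):
    """Return a symmetric 0/1 adjacency matrix linking each point to its k nearest neighbours."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        dists = sorted(
            (math.dist(features[i], features[j]), j) for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = adj[j][i] = 1
    return adj

def per_sample_energy(adj, labels):
    """Count, for each sample, the graph neighbours whose label disagrees with it.

    For one-hot labels Y on an unweighted graph, each disagreeing edge adds
    ||Y_i - Y_j||^2 = 2 to the Laplacian energy, so this count is proportional
    to the sample's share of the total energy.
    """
    n = len(labels)
    return [
        sum(adj[i][j] for j in range(n) if labels[i] != labels[j])
        for i in range(n)
    ]

# Two well-separated toy clusters; the label of sample 2 is flipped (noisy).
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
noisy_labels = [0, 0, 1, 1, 1, 1]

adj = knn_graph(features, k=2)
energy = per_sample_energy(adj, noisy_labels)
# The sample with the highest energy is the least consistent with its neighbours.
suspect = max(range(len(energy)), key=energy.__getitem__)
```

In this toy setup the flipped sample carries the largest energy, which is the intuition behind turning energy into a per-sample clean probability.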

Deeply Coupled Cross-Modal Prompt Learning

May 30, 2023
Xuejing Liu, Wei Tang, Jinghui Lu, Rui Zhao, Zhaojun Guo, Fei Tan

Recent advancements in multimodal foundation models (e.g., CLIP) have excelled in zero-shot generalization. Prompt tuning, which is involved in the knowledge transfer from foundation models to downstream tasks, has gained significant attention recently. Existing prompt-tuning methods in cross-modal learning, however, either focus solely on the language branch or learn vision-language interaction through a shallow mechanism. In this context, we propose a Deeply coupled Cross-modal Prompt learning (DCP) method based on CLIP. DCP flexibly accommodates the interplay between vision and language with a Cross-Modal Prompt Attention (CMPA) mechanism, which enables a progressive and strong mutual exchange of the respective representations through a well-connected multi-head attention module. We then conduct comprehensive few-shot learning experiments on 11 image classification datasets and analyze the robustness to domain shift as well. Thorough experimental analysis demonstrates the superb few-shot generalization and compelling domain adaptation capacity of a well-executed DCP. The code can be found at https://github.com/GingL/CMPA.

* Accepted by ACL 2023 findings 

Disambiguated Attention Embedding for Multi-Instance Partial-Label Learning

May 26, 2023
Wei Tang, Weijia Zhang, Min-Ling Zhang

In many real-world tasks, the objects of interest can be represented as a multi-instance bag associated with a candidate label set, which consists of one ground-truth label and several false positive labels. Multi-instance partial-label learning (MIPL) is a learning paradigm to deal with such tasks and has achieved favorable performance. The existing MIPL approach follows the instance-space paradigm by assigning augmented candidate label sets of bags to each instance and aggregating bag-level labels from instance-level labels. However, this scheme may be suboptimal as global bag-level information is ignored and the predicted labels of bags are sensitive to predictions of negative instances. In this paper, we study an alternative scheme where a multi-instance bag is embedded into a single vector representation. Accordingly, an intuitive algorithm named DEMIPL, i.e., Disambiguated attention Embedding for Multi-Instance Partial-Label learning, is proposed. DEMIPL employs a disambiguation attention mechanism to aggregate a multi-instance bag into a single vector representation, followed by a momentum-based disambiguation strategy to identify the ground-truth label from the candidate label set. Furthermore, we introduce a real-world MIPL dataset for colorectal cancer classification. Experimental results on benchmark and real-world datasets validate the superiority of DEMIPL against other well-established MIPL and partial-label learning methods. Our code and datasets will be made publicly available.
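
The idea of embedding a bag into a single vector can be sketched with generic softmax attention pooling. This is a minimal sketch under stated assumptions, not DEMIPL itself: the function name `attention_pool` is hypothetical, and the attention scores are fixed by hand here, whereas an attention mechanism would learn them.

```python
import math

def attention_pool(bag, scores):
    """Aggregate a bag of instance vectors into one vector using softmax attention weights."""
    top = max(scores)                                # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(bag[0])
    pooled = [sum(w * inst[d] for w, inst in zip(weights, bag)) for d in range(dim)]
    return pooled, weights

# A toy bag of three 2D instances.
bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Equal (hypothetical) scores reduce attention pooling to a plain mean.
pooled_mean, weights_mean = attention_pool(bag, [0.0, 0.0, 0.0])

# A dominant score makes the bag embedding follow a single instance.
pooled_peak, weights_peak = attention_pool(bag, [10.0, 0.0, 0.0])
```

Because the weights sum to one, instances with low scores contribute little to the bag embedding, which is how a pooled representation can be less sensitive to negative instances than per-instance label aggregation.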

Performative Prediction with Bandit Feedback: Learning through Reparameterization

May 08, 2023
Yatong Chen, Wei Tang, Chien-Ju Ho, Yang Liu

Performative prediction, as introduced by Perdomo et al. (2020), is a framework for studying social prediction in which the data distribution itself changes in response to the deployment of a model. Existing work on optimizing accuracy in this setting hinges on two assumptions that are easily violated in practice: that the performative risk is convex over the deployed model, and that the mapping from the model to the data distribution is known to the model designer in advance. In this paper, we initiate the study of tractable performative prediction problems that do not require these assumptions. To tackle this more challenging setting, we develop a two-level zeroth-order optimization algorithm, where one level aims to compute the distribution map, and the other level reparameterizes the performative prediction objective as a function of the induced data distribution. Under mild conditions, this reparameterization allows us to transform the non-convex objective into a convex one and achieve provable regret guarantees. In particular, we provide a regret bound that is sublinear in the total number of performative samples taken and only polynomial in the dimension of the model parameter.
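
For readers unfamiliar with zeroth-order optimization, the sketch below minimizes a function using only function evaluations, with gradients estimated by central differences. It is a generic, noise-free illustration and not the two-level algorithm of the paper; the `risk` function is a hypothetical convex stand-in, and the step size and iteration count are arbitrary choices.

```python
def zeroth_order_gradient(f, x, delta=1e-4):
    """Estimate the gradient of f at x from function evaluations only (central differences)."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += delta
        down = list(x)
        down[i] -= delta
        grad.append((f(up) - f(down)) / (2 * delta))
    return grad

def minimize(f, x0, step=0.1, iters=200):
    """Plain gradient descent driven by the zeroth-order gradient estimate."""
    x = list(x0)
    for _ in range(iters):
        g = zeroth_order_gradient(f, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

# Hypothetical convex objective that can be queried but not differentiated analytically.
def risk(theta):
    return (theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2

theta_hat = minimize(risk, [0.0, 0.0])
```

The descent converges here because the stand-in objective is convex, which mirrors the role the reparameterization plays in the abstract: it turns a non-convex objective into a convex one, on which methods of this kind behave well.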

Dynamic Pricing and Learning with Bayesian Persuasion

Apr 27, 2023
Shipra Agrawal, Yiding Feng, Wei Tang

We consider a novel dynamic pricing and learning setting where, in addition to setting prices of products in sequential rounds, the seller also ex-ante commits to 'advertising schemes'. That is, at the beginning of each round the seller can decide what kind of signal they will provide to the buyer about the product's quality upon realization. Using the popular Bayesian persuasion framework to model the effect of these signals on the buyers' valuation and purchase responses, we formulate the problem of finding an optimal design of the advertising scheme along with a pricing scheme that maximizes the seller's expected revenue. Without any a priori knowledge of the buyers' demand function, our goal is to design an online algorithm that can use past purchase responses to adaptively learn the optimal pricing and advertising strategy. We study the regret of the algorithm when compared to the optimal clairvoyant price and advertising scheme. Our main result is a computationally efficient online algorithm that achieves an $O(T^{2/3}(m\log T)^{1/3})$ regret bound when the valuation function is linear in the product quality. Here $m$ is the cardinality of the discrete product quality domain and $T$ is the time horizon. This result requires some natural monotonicity and Lipschitz assumptions on the valuation function, but no Lipschitz or smoothness assumption on the buyers' demand function. For constant $m$, our result matches, within logarithmic factors, the regret lower bound for dynamic pricing, which is a special case of our problem. We also obtain several improved results for the widely considered special case of additive valuations, including an $\tilde{O}(T^{2/3})$ regret bound independent of $m$ when $m\le T^{1/3}$.
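
To make the stated rate concrete, the snippet below evaluates $T^{2/3}(m\log T)^{1/3}$ with all constants dropped and divides by $T$ to get a per-round figure. The helper name `regret_bound` and the value $m=4$ are assumptions for illustration only; the numbers show the growth rate and are not actual regret values.

```python
import math

def regret_bound(T, m):
    """Evaluate T^(2/3) * (m * log T)^(1/3), i.e. the stated bound with constants dropped."""
    return T ** (2 / 3) * (m * math.log(T)) ** (1 / 3)

# Per-round bound for a short and a long horizon, with a hypothetical m = 4.
per_round_short = regret_bound(10**3, m=4) / 10**3
per_round_long = regret_bound(10**6, m=4) / 10**6

for horizon, value in ((10**3, per_round_short), (10**6, per_round_long)):
    print(horizon, value)
```

Since the bound is sublinear in $T$, the per-round value shrinks as the horizon grows.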

MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking

Mar 18, 2023
Zheng Qin, Sanping Zhou, Le Wang, Jinghai Duan, Gang Hua, Wei Tang

The main challenge of Multi-Object Tracking (MOT) lies in maintaining a continuous trajectory for each target. Existing methods often learn reliable motion patterns to match the same target between adjacent frames and discriminative appearance features to re-identify lost targets after a long period. However, the reliability of motion prediction and the discriminability of appearances can easily be hurt by dense crowds and extreme occlusions in the tracking process. In this paper, we propose a simple yet effective multi-object tracker, i.e., MotionTrack, which learns robust short-term and long-term motions in a unified framework to associate trajectories from a short to long range. For dense crowds, we design a novel Interaction Module to learn interaction-aware motions from short-term trajectories, which can estimate the complex movement of each target. For extreme occlusions, we build a novel Refind Module to learn reliable long-term motions from the target's history trajectory, which can link the interrupted trajectory with its corresponding detection. Our Interaction Module and Refind Module are embedded in the well-known tracking-by-detection paradigm and can work in tandem to maintain superior performance. Extensive experimental results on the MOT17 and MOT20 datasets demonstrate the superiority of our approach in challenging scenarios, and it achieves state-of-the-art performance on various MOT metrics.

* Accepted by CVPR2023! 

A Global and Patch-wise Contrastive Loss for Accurate Automated Exudate Detection

Feb 22, 2023
Wei Tang, Yinxiao Wang, Kangning Cui, Raymond H. Chan

Diabetic retinopathy (DR) is a leading cause of blindness worldwide. Early diagnosis is essential in the treatment of diabetes and can assist in preventing vision impairment. Since manual annotation of medical images is time-consuming, costly, and prone to subjectivity that leads to inconsistent diagnoses, several deep learning segmentation approaches have been proposed to address these challenges. However, these networks often rely on simple loss functions, such as binary cross entropy (BCE), which may not be sophisticated enough to effectively segment lesions such as those present in DR. In this paper, we propose a loss function that incorporates a global segmentation loss, a patch-wise density loss, and a patch-wise edge-aware loss to improve the performance of these networks on the detection and segmentation of hard exudates. Comparing our proposed loss function against the BCE loss on several state-of-the-art networks, our experimental results reveal substantial improvement in network performance achieved by incorporating the patch-wise contrastive loss.

* 8 pages, 3 figures, 1 table 

Multi-Instance Partial-Label Learning: Towards Exploiting Dual Inexact Supervision

Dec 18, 2022
Wei Tang, Weijia Zhang, Min-Ling Zhang

Weakly supervised machine learning algorithms are able to learn from ambiguous samples or labels, e.g., multi-instance learning or partial-label learning. However, in some real-world tasks, each training sample is associated with not only multiple instances but also a candidate label set that contains one ground-truth label and some false positive labels. Specifically, at least one instance pertains to the ground-truth label while no instance belongs to the false positive labels. In this paper, we formalize such problems as multi-instance partial-label learning (MIPL). Existing multi-instance learning algorithms and partial-label learning algorithms are suboptimal for solving MIPL problems since the former fail to disambiguate a candidate label set, and the latter cannot handle a multi-instance bag. To address these issues, a tailored algorithm named MIPLGP, i.e., Multi-Instance Partial-Label learning with Gaussian Processes, is proposed. MIPLGP first assigns each instance a candidate label set in an augmented label space, then transforms the candidate label set into a logarithmic space to yield disambiguated and continuous labels via an exclusive disambiguation strategy, and finally induces a model based on Gaussian processes. Experimental results on various datasets validate that MIPLGP is superior to well-established multi-instance learning and partial-label learning algorithms for solving MIPL problems. Our code and datasets will be made publicly available.

Be Careful with Rotation: A Uniform Backdoor Pattern for 3D Shape

Dec 01, 2022
Linkun Fan, Fazhi He, Qing Guo, Wei Tang, Xiaolin Hong, Bing Li

To save cost, many deep neural networks (DNNs) are trained on third-party datasets downloaded from the internet, which enables attackers to implant backdoors into DNNs. In the 2D domain, the inherent structures of different image formats are similar. Hence, a backdoor attack designed for one image format will also suit others. However, in the 3D world, there is a huge disparity among different 3D data structures. As a result, a backdoor pattern designed for one 3D data structure will be ineffective for other data structures of the same 3D scene. Therefore, this paper designs a uniform backdoor pattern, NRBdoor (Noisy Rotation Backdoor), which is able to adapt to heterogeneous 3D data structures. Specifically, we start from the unit rotation and then search for the optimal pattern through a noise generation and selection process. The proposed NRBdoor is natural and imperceptible, since rotation is a common operation that usually contains noise due to both the mismatch between a pair of points and the sensor calibration error in real-world 3D scenes. Extensive experiments on 3D meshes and point clouds show that the proposed NRBdoor achieves state-of-the-art performance, with negligible shape variation.
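
A small example shows why a rotation is a natural candidate for a pattern that applies to any vertex-based 3D representation: it acts only on coordinates and leaves the shape intact. The sketch below applies a fixed rotation about the z-axis to a toy point set. The axis, the 3-degree angle, and the function name `rotate_z` are assumptions for illustration, and the noise generation and selection process of NRBdoor is not reproduced.

```python
import math

def rotate_z(points, angle_rad):
    """Rotate 3D points about the z-axis by angle_rad (a rigid transform)."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

# Toy point set; the same coordinates could be the vertices of a mesh.
cloud = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
angle = math.radians(3.0)          # hypothetical small rotation angle
rotated = rotate_z(cloud, angle)

# Largest change in any pairwise distance between the original and rotated points.
max_distortion = max(
    abs(math.dist(cloud[i], cloud[j]) - math.dist(rotated[i], rotated[j]))
    for i in range(len(cloud))
    for j in range(i + 1, len(cloud))
)
```

The coordinates change, but pairwise distances are preserved up to floating-point error, so a pure rotation does not deform the object.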
