Yiming Qian

Test-Time Adaptation for Point Cloud Upsampling Using Meta-Learning

Sep 01, 2023

Ahmed Hatem, Yiming Qian, Yang Wang

Abstract: Affordable 3D scanners often produce sparse and non-uniform point clouds that negatively impact downstream applications in robotic systems. While existing point cloud upsampling architectures have demonstrated promising results on standard benchmarks, they tend to suffer significant performance drops when the test data follow a different distribution from the training data. To address this issue, this paper proposes a test-time adaptation approach to improve the generalization of point cloud upsampling models. The proposed approach leverages meta-learning to explicitly learn network parameters for test-time adaptation. Our method does not require any prior information about the test data. During meta-training, the model parameters are learned from a collection of instance-level tasks, each of which consists of a sparse-dense pair of point clouds from the training data. During meta-testing, the trained model is fine-tuned with a few gradient updates to produce a unique set of network parameters for each test instance. The updated model is then used for the final prediction. Our framework is generic and can be applied in a plug-and-play manner with existing backbone networks in point cloud upsampling. Extensive experiments demonstrate that our approach improves the performance of state-of-the-art models.

* Accepted at IROS 2023
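
The meta-testing step described in the abstract (a few gradient updates per test instance, starting from shared meta-trained parameters) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the parameter vector, the quadratic per-instance loss, the helper names `adapt` and `make_instance_grad`, and the step count and learning rate are all assumptions made for the example.

```python
from typing import Callable, List

Params = List[float]


def adapt(params: Params,
          grad_fn: Callable[[Params], Params],
          steps: int = 3,
          lr: float = 0.1) -> Params:
    """Return instance-specific parameters after a few gradient steps.

    The shared parameters are copied, never modified in place, so every
    test instance starts from the same initialization.
    """
    adapted = list(params)
    for _ in range(steps):
        grads = grad_fn(adapted)
        adapted = [p - lr * g for p, g in zip(adapted, grads)]
    return adapted


def make_instance_grad(target: Params) -> Callable[[Params], Params]:
    """Toy stand-in for an instance-level loss: L(p) = sum((p - target)^2)."""
    return lambda p: [2.0 * (pi - ti) for pi, ti in zip(p, target)]


# Hypothetical meta-trained initialization and two toy "test instances".
meta_params = [0.0, 0.0]
instance_targets = [[1.0, -1.0], [0.5, 2.0]]

for target in instance_targets:
    theta = adapt(meta_params, make_instance_grad(target), steps=3, lr=0.1)
    print(target, [round(t, 3) for t in theta])
```

Each instance ends up with its own parameter set, while `meta_params` stays untouched for the next instance.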

Point-TTA: Test-Time Adaptation for Point Cloud Registration Using Multitask Meta-Auxiliary Learning

Sep 01, 2023

Ahmed Hatem, Yiming Qian, Yang Wang

Abstract: We present Point-TTA, a novel test-time adaptation framework for point cloud registration (PCR) that improves the generalization and the performance of registration models. While learning-based approaches have achieved impressive progress, generalization to unknown testing environments remains a major challenge due to the variations in 3D scans. Existing methods typically train a generic model and apply the same trained model to every instance during testing. This can be sub-optimal, since it is difficult for a single model to handle all the variations encountered during testing. In this paper, we propose a test-time adaptation approach for PCR. Our model can adapt to unseen distributions at test time without requiring any prior knowledge of the test data. Concretely, we design three self-supervised auxiliary tasks that are optimized jointly with the primary PCR task. Given a test instance, we adapt our model using these auxiliary tasks, and the updated model is used to perform the inference. During training, our model is trained using a meta-auxiliary learning approach, such that the model adapted via the auxiliary tasks improves the accuracy of the primary task. Experimental results demonstrate the effectiveness of our approach in improving the generalization of point cloud registration and outperforming other state-of-the-art approaches.

* Accepted at ICCV 2023

IncreLoRA: Incremental Parameter Allocation Method for Parameter-Efficient Fine-tuning

Aug 23, 2023

Feiyu Zhang, Liangzhi Li, Junhao Chen, Zhouqiang Jiang, Bowen Wang, Yiming Qian

Abstract: With the increasing size of pre-trained language models (PLMs), fine-tuning all the parameters in the model is not efficient, especially when there are a large number of downstream tasks, which incur significant training and storage costs. Many parameter-efficient fine-tuning (PEFT) approaches have been proposed, among which Low-Rank Adaptation (LoRA) is a representative approach that injects trainable rank decomposition matrices into every target module. Yet LoRA ignores the importance of parameters in different modules. To address this problem, many works have been proposed to prune the parameters of LoRA. However, under limited training conditions, the upper bound of the rank of the pruned parameter matrix is still affected by the preset values. We therefore propose IncreLoRA, an incremental parameter allocation method that adaptively adds trainable parameters during training based on the importance scores of each module. This approach differs from pruning in that it is not limited by the initial number of trainable parameters, and each parameter matrix has a higher rank upper bound for the same training overhead. We conduct extensive experiments on GLUE to demonstrate the effectiveness of IncreLoRA. The results show that our method achieves higher parameter efficiency, especially in low-resource settings, where it significantly outperforms the baselines. Our code is publicly available.
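
To illustrate the idea of growing ranks from importance scores rather than pruning them, here is a minimal sketch. It is not the IncreLoRA algorithm itself: the abstract does not say how importance is scored or how many modules grow per step, so the module names, the placeholder scores, and the `top_k` and `budget` parameters are all hypothetical.

```python
from typing import Dict


def allocate_ranks(ranks: Dict[str, int],
                   importance: Dict[str, float],
                   top_k: int,
                   budget: int) -> Dict[str, int]:
    """Add one rank to each of the top_k most important modules.

    Allocation stops once the total rank across modules reaches `budget`.
    The input dictionary is not modified.
    """
    new_ranks = dict(ranks)
    remaining = budget - sum(new_ranks.values())
    by_importance = sorted(importance, key=importance.get, reverse=True)
    for name in by_importance[:top_k]:
        if remaining <= 0:
            break
        new_ranks[name] += 1
        remaining -= 1
    return new_ranks


# Every module starts at rank 1; the scores below are made-up placeholders.
ranks = {"query": 1, "value": 1, "ffn": 1}
scores_per_step = [
    {"query": 0.9, "value": 0.2, "ffn": 0.5},
    {"query": 0.4, "value": 0.1, "ffn": 0.8},
]
for scores in scores_per_step:
    ranks = allocate_ranks(ranks, scores, top_k=1, budget=6)
print(ranks)
```

Because ranks only ever grow, the final rank of a module is bounded by the total budget rather than by a preset initial rank, which is the contrast with pruning that the abstract draws.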

RoSI: Recovering 3D Shape Interiors from Few Articulation Images

Apr 13, 2023

Akshay Gadi Patil, Yiming Qian, Shan Yang, Brian Jackson, Eric Bennett, Hao Zhang

Abstract: The vast majority of 3D models that appear in gaming and VR/AR, as well as those we use to train geometric deep learning algorithms, are incomplete, since they are modeled as surface meshes and are missing their interior structures. We present a learning framework to recover the shape interiors (RoSI) of existing 3D models with only their exteriors from multi-view and multi-articulation images. Given a set of RGB images that capture a target 3D object in different articulated poses, possibly from only a few views, our method infers the interior planes that are observable in the input images. Our neural architecture is trained in a category-agnostic manner and consists of a motion-aware multi-view analysis phase including pose, depth, and motion estimation, followed by interior plane detection in images and 3D space, and finally multi-view plane fusion. In addition, our method predicts part articulations and is able to realize and even extrapolate the captured motions on the target 3D object. We evaluate our method through quantitative and qualitative comparisons to baselines and alternative solutions, as well as by testing on untrained object categories and real image inputs to assess its generalization capabilities.

Uncertainty-inspired Open Set Learning for Retinal Anomaly Identification

Apr 08, 2023

Meng Wang, Tian Lin, Lianyu Wang, Aidi Lin, Ke Zou, Xinxing Xu, Yi Zhou, Yuanyuan Peng, Qingquan Meng, Yiming Qian (+14 more)

Abstract: Failure to recognize samples from classes unseen during training is a major limitation of artificial intelligence (AI) in real-world implementation of retinal anomaly classification. To resolve this obstacle, we propose an uncertainty-inspired open-set (UIOS) model, which was trained with fundus images of 9 common retinal conditions. Besides the probability of each category, UIOS also calculates an uncertainty score to express its confidence. Our UIOS model with a thresholding strategy achieved F1 scores of 99.55%, 97.01% and 91.91% on the internal testing set, external testing set and non-typical testing set, respectively, compared to F1 scores of 92.20%, 80.69% and 64.74% for the standard AI model. Furthermore, UIOS correctly predicted high uncertainty scores, which prompted the need for a manual check, in the datasets of rare retinal diseases, low-quality fundus images, and non-fundus images. This work provides a robust method for real-world screening of retinal anomalies.
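
The thresholding strategy mentioned above can be pictured with a small sketch: accept the most probable class when the uncertainty score is low, and otherwise defer to a manual check. How UIOS computes its uncertainty score and which threshold it uses are not given in the abstract, so the function name `open_set_decision`, the default threshold of 0.5, and the example numbers are assumptions.

```python
from typing import List, Optional, Tuple


def open_set_decision(probs: List[float],
                      uncertainty: float,
                      threshold: float = 0.5) -> Tuple[Optional[int], bool]:
    """Return (predicted class index or None, needs_manual_check).

    A sample whose uncertainty exceeds the threshold is not classified;
    it is flagged for manual review instead.
    """
    if uncertainty > threshold:
        return None, True
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, False


# Made-up example scores: a confident sample and an uncertain one.
print(open_set_decision([0.05, 0.90, 0.05], uncertainty=0.10))
print(open_set_decision([0.40, 0.35, 0.25], uncertainty=0.80))
```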

Learning to Recover Spectral Reflectance from RGB Images

Apr 04, 2023

Dong Huo, Jian Wang, Yiming Qian, Yee-Hong Yang

Abstract: This paper tackles spectral reflectance recovery (SRR) from RGB images. Since capturing ground-truth spectral reflectance and camera spectral sensitivity is challenging and costly, most existing approaches are trained on synthetic images and use the same parameters for all unseen testing images. This is suboptimal, especially when the trained models are tested on real images, because they never exploit the internal information of the testing images. To address this issue, we adopt a self-supervised meta-auxiliary learning (MAXL) strategy that fine-tunes the well-trained network parameters with each testing image to combine external with internal information. To the best of our knowledge, this is the first work that successfully adapts the MAXL strategy to this problem. Instead of relying on naive end-to-end training, we also propose a novel architecture that integrates the physical relationship between the spectral reflectance and the corresponding RGB images into the network based on our mathematical analysis. In addition, since the spectral reflectance of a scene is independent of its illumination while the corresponding RGB images are not, we recover the spectral reflectance of a scene from its RGB images captured under multiple illuminations to further reduce the unknowns. Qualitative and quantitative evaluations demonstrate the effectiveness of our proposed network and of the MAXL. Our code and data are available at https://github.com/Dong-Huo/SRR-MAXL.

Federated Uncertainty-Aware Aggregation for Fundus Diabetic Retinopathy Staging

Mar 23, 2023

Meng Wang, Lianyu Wang, Xinxing Xu, Ke Zou, Yiming Qian, Rick Siow Mong Goh, Yong Liu, Huazhu Fu

Abstract: Deep learning models have shown promising performance in the field of diabetic retinopathy (DR) staging. However, collaboratively training a DR staging model across multiple institutions remains a challenge due to non-iid data, client reliability, and confidence evaluation of the prediction. To address these issues, we propose a novel federated uncertainty-aware aggregation paradigm (FedUAA), which considers the reliability of each client and produces a confidence estimation for the DR staging. In our FedUAA, an aggregated encoder is shared by all clients for learning a global representation of fundus images, while a novel temperature-warmed uncertainty head (TWEU) is utilized for each client for local personalized staging criteria. Our TWEU employs an evidential deep layer to produce the uncertainty score with the DR staging results for client reliability evaluation. Furthermore, we developed a novel uncertainty-aware weighting module (UAW) to dynamically adjust the weights of model aggregation based on the uncertainty score distribution of each client. In our experiments, we collect five publicly available datasets from different institutions to construct a dataset for federated DR staging that satisfies the real non-iid condition. The experimental results demonstrate that our FedUAA achieves better DR staging performance with higher reliability compared to other federated learning methods. Our proposed FedUAA paradigm effectively addresses the challenges of collaboratively training DR staging models across multiple institutions, and provides a robust and reliable solution for the deployment of DR diagnosis models in real-world clinical scenarios.
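
As a rough picture of how an evidential uncertainty score could drive aggregation weights, the sketch below uses the standard subjective-logic formulation from evidential deep learning (u = K / sum(alpha), with alpha_k = evidence_k + 1) and a simple "lower uncertainty, larger weight" rule. Neither detail is stated in the abstract: the actual TWEU head and UAW module may differ, and the function names and the weighting rule here are assumptions.

```python
from typing import List


def evidential_uncertainty(evidence: List[float]) -> float:
    """Subjective-logic uncertainty u = K / sum(alpha), alpha_k = evidence_k + 1."""
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)
    return num_classes / strength


def aggregate(client_params: List[List[float]],
              client_uncertainty: List[float]) -> List[float]:
    """Weighted average of client parameters; lower uncertainty gets more weight.

    Falls back to a uniform average if every client is maximally uncertain.
    """
    raw = [1.0 - u for u in client_uncertainty]
    total = sum(raw)
    if total <= 0.0:
        weights = [1.0 / len(raw)] * len(raw)
    else:
        weights = [r / total for r in raw]
    dim = len(client_params[0])
    return [sum(w * p[i] for w, p in zip(weights, client_params))
            for i in range(dim)]


# Two hypothetical clients with made-up evidence vectors and parameters.
u_a = evidential_uncertainty([9.0, 0.0, 0.0])   # strong evidence for one class
u_b = evidential_uncertainty([0.0, 0.0, 1.0])   # weak evidence overall
print(u_a, u_b)
print(aggregate([[0.0, 2.0], [1.0, 4.0]], [u_a, u_b]))
```

With no evidence at all, the uncertainty is exactly 1, and it shrinks toward 0 as evidence accumulates, so a client whose predictions carry little evidence contributes less to the aggregated model under this rule.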

Inference Time Evidences of Adversarial Attacks for Forensic on Transformers

Jan 31, 2023

Hugo Lemarchant, Liangzi Li, Yiming Qian, Yuta Nakashima, Hajime Nagahara

Abstract: Vision Transformers (ViTs) are becoming a very popular paradigm for vision tasks as they achieve state-of-the-art performance on image classification. However, although early works implied that this network structure had increased robustness against adversarial attacks, some works argue that ViTs are still vulnerable. This paper presents our first attempt toward detecting adversarial attacks during inference time using the network's inputs and outputs as well as latent features. We design four quantifications (or derivatives) of input, output, and latent vectors of ViT-based models that provide a signature of the inference, which could be beneficial for attack detection, and empirically study their behavior over clean samples and adversarial samples. The results demonstrate that the quantifications from input (images) and output (posterior probabilities) are promising for distinguishing clean and adversarial samples, while latent vectors offer less discriminative power, though they give some insights into how adversarial perturbations work.

TrFedDis: Trusted Federated Disentangling Network for Non-IID Domain Feature

Jan 30, 2023

Meng Wang, Kai Yu, Chun-Mei Feng, Yiming Qian, Ke Zou, Lianyu Wang, Rick Siow Mong Goh, Xinxing Xu, Yong Liu, Huazhu Fu

Abstract: Federated learning (FL), as an effective decentralized distributed learning approach, enables multiple institutions to jointly train a model without sharing their local data. However, the domain feature shift caused by different acquisition devices/clients substantially degrades the performance of the FL model. Furthermore, most existing FL approaches aim to improve accuracy without considering reliability (e.g., confidence or uncertainty). The predictions are thus unreliable when deployed in safety-critical applications. To improve the performance of FL under non-IID domain features while making the model more reliable, we propose a novel trusted federated disentangling network, termed TrFedDis, which uses feature disentangling to capture the global domain-invariant cross-client representation while preserving local client-specific feature learning. Meanwhile, to effectively integrate the decoupled features, an uncertainty-aware decision fusion is introduced to guide the network in dynamically integrating the decoupled features at the evidence level, while producing a reliable prediction with an estimated uncertainty. To the best of our knowledge, our proposed TrFedDis is the first work to develop an FL approach based on evidential uncertainty combined with feature disentangling, which enhances the performance and reliability of FL in non-IID domain features. Extensive experimental results show that our proposed TrFedDis provides outstanding performance with a high degree of reliability compared to other state-of-the-art FL approaches.

HAL3D: Hierarchical Active Learning for Fine-Grained 3D Part Labeling

Jan 25, 2023

Fenggen Yu, Yiming Qian, Francisca Gil-Ureta, Brian Jackson, Eric Bennett, Hao Zhang

Abstract: We present the first active learning tool for fine-grained 3D part labeling, a problem which challenges even the most advanced deep learning (DL) methods due to the significant structural variations among the small and intricate parts. For the same reason, the necessary data annotation effort is tremendous, motivating approaches to minimize human involvement. Our labeling tool iteratively verifies or modifies part labels predicted by a deep neural network, with human feedback continually improving the network prediction. To effectively reduce human effort, we develop two novel features in our tool: hierarchical and symmetry-aware active labeling. Our human-in-the-loop approach, coined HAL3D, achieves 100% accuracy (barring human errors) on any test set with pre-defined hierarchical part labels, with 80% time saving over manual effort.
