Alert button
Picture for Gang Li

Gang Li

Alert button

Cartoondiff: Training-free Cartoon Image Generation with Diffusion Transformer Models

Sep 15, 2023
Feihong He, Gang Li, Lingyu Si, Leilei Yan, Shimeng Hou, Hongwei Dong, Fanzhang Li

Image cartoonization has attracted significant interest in the field of image generation. However, most of the existing image cartoonization techniques require re-training models using images of cartoon style. In this paper, we present CartoonDiff, a novel training-free sampling approach which generates image cartoonization using diffusion transformer models. Specifically, we decompose the reverse process of diffusion models into the semantic generation phase and the detail generation phase. Furthermore, we implement the image cartoonization process by normalizing high-frequency signal of the noisy image in specific denoising steps. CartoonDiff doesn't require any additional reference images, complex model designs, or the tedious adjustment of multiple parameters. Extensive experimental results show the powerful ability of our CartoonDiff. The project page is available at: https://cartoondiff.github.io/

* 5 pages,5 figures 
Viaarxiv icon

Early Autism Diagnosis based on Path Signature and Siamese Unsupervised Feature Compressor

Jul 12, 2023
Zhuowen Yin, Xinyao Ding, Xin Zhang, Zhengwang Wu, Li Wang, Gang Li

Figure 1 for Early Autism Diagnosis based on Path Signature and Siamese Unsupervised Feature Compressor
Figure 2 for Early Autism Diagnosis based on Path Signature and Siamese Unsupervised Feature Compressor
Figure 3 for Early Autism Diagnosis based on Path Signature and Siamese Unsupervised Feature Compressor
Figure 4 for Early Autism Diagnosis based on Path Signature and Siamese Unsupervised Feature Compressor

Autism Spectrum Disorder (ASD) has been emerging as a growing public health threat. Early diagnosis of ASD is crucial for timely, effective intervention and treatment. However, conventional diagnosis methods based on communications and behavioral patterns are unreliable for children younger than 2 years of age. Given evidences of neurodevelopmental abnormalities in ASD infants, we resort to a novel deep learning-based method to extract key features from the inherently scarce, class-imbalanced, and heterogeneous structural MR images for early autism diagnosis. Specifically, we propose a Siamese verification framework to extend the scarce data, and an unsupervised compressor to alleviate data imbalance by extracting key features. We also proposed weight constraints to cope with sample heterogeneity by giving different samples different voting weights during validation, and we used Path Signature to unravel meaningful developmental features from the two-time point data longitudinally. Extensive experiments have shown that our method performed well under practical scenarios, transcending existing machine learning methods.

Viaarxiv icon

Artificial General Intelligence for Medical Imaging

Jul 03, 2023
Xiang Li, Lu Zhang, Zihao Wu, Zhengliang Liu, Lin Zhao, Yixuan Yuan, Jun Liu, Gang Li, Dajiang Zhu, Pingkun Yan, Quanzheng Li, Wei Liu, Tianming Liu, Dinggang Shen

Figure 1 for Artificial General Intelligence for Medical Imaging

In this review, we explore the potential applications of Artificial General Intelligence (AGI) models in healthcare, focusing on foundational Large Language Models (LLMs), Large Vision Models, and Large Multimodal Models. We emphasize the importance of integrating clinical expertise, domain knowledge, and multimodal capabilities into AGI models. In addition, we lay out key roadmaps that guide the development and deployment of healthcare AGI models. Throughout the review, we provide critical perspectives on the potential challenges and pitfalls associated with deploying large-scale AGI models in the medical field. This comprehensive review aims to offer insights into the future implications of AGI in medical imaging, healthcare and beyond.

Viaarxiv icon

Improving Federated Aggregation with Deep Unfolding Networks

Jun 30, 2023
Shanika I Nanayakkara, Shiva Raj Pokhrel, Gang Li

Figure 1 for Improving Federated Aggregation with Deep Unfolding Networks
Figure 2 for Improving Federated Aggregation with Deep Unfolding Networks
Figure 3 for Improving Federated Aggregation with Deep Unfolding Networks
Figure 4 for Improving Federated Aggregation with Deep Unfolding Networks

The performance of Federated learning (FL) is negatively affected by device differences and statistical characteristics between participating clients. To address this issue, we introduce a deep unfolding network (DUN)-based technique that learns adaptive weights that unbiasedly ameliorate the adverse impacts of heterogeneity. The proposed method demonstrates impressive accuracy and quality-aware aggregation. Furthermore, it evaluated the best-weighted normalization approach to define less computational power on the aggregation method. The numerical experiments in this study demonstrate the effectiveness of this approach and provide insights into the interpretability of the unbiased weights learned. By incorporating unbiased weights into the model, the proposed approach effectively addresses quality-aware aggregation under the heterogeneity of the participating clients and the FL environment. Codes and details are \href{https://github.com/shanikairoshi/Improved_DUN_basedFL_Aggregation}{here}.

* 5 pages, 3 figures and submitted to IEEE Network Letters 
Viaarxiv icon

Quantum Federated Learning: Analysis, Design and Implementation Challenges

Jun 27, 2023
Dev Gurung, Shiva Raj Pokhrel, Gang Li

Figure 1 for Quantum Federated Learning: Analysis, Design and Implementation Challenges
Figure 2 for Quantum Federated Learning: Analysis, Design and Implementation Challenges
Figure 3 for Quantum Federated Learning: Analysis, Design and Implementation Challenges
Figure 4 for Quantum Federated Learning: Analysis, Design and Implementation Challenges

Quantum Federated Learning (QFL) has gained significant attention due to quantum computing and machine learning advancements. As the demand for QFL continues to surge, there is a pressing need to comprehend its intricacies in distributed environments. This paper aims to provide a comprehensive overview of the current state of QFL, addressing a crucial knowledge gap in the existing literature. We develop ideas for new QFL frameworks, explore diverse use cases of applications, and consider the critical factors influencing their design. The technical contributions and limitations of various QFL research projects are examined while presenting future research directions and open questions for further exploration.

Viaarxiv icon

Decentralized Quantum Federated Learning for Metaverse: Analysis, Design and Implementation

Jun 20, 2023
Dev Gurung, Shiva Raj Pokhrel, Gang Li

Figure 1 for Decentralized Quantum Federated Learning for Metaverse: Analysis, Design and Implementation
Figure 2 for Decentralized Quantum Federated Learning for Metaverse: Analysis, Design and Implementation
Figure 3 for Decentralized Quantum Federated Learning for Metaverse: Analysis, Design and Implementation
Figure 4 for Decentralized Quantum Federated Learning for Metaverse: Analysis, Design and Implementation

With the emerging developments of the Metaverse, a virtual world where people can interact, socialize, play, and conduct their business, it has become critical to ensure that the underlying systems are transparent, secure, and trustworthy. To this end, we develop a decentralized and trustworthy quantum federated learning (QFL) framework. The proposed QFL leverages the power of blockchain to create a secure and transparent system that is robust against cyberattacks and fraud. In addition, the decentralized QFL system addresses the risks associated with a centralized server-based approach. With extensive experiments and analysis, we evaluate classical federated learning (CFL) and QFL in a distributed setting and demonstrate the practicality and benefits of the proposed design. Our theoretical analysis and discussions develop a genuinely decentralized financial system essential for the Metaverse. Furthermore, we present the application of blockchain-based QFL in a hybrid metaverse powered by a metaverse observer and world model. Our implementation details and code are publicly available 1.

Viaarxiv icon

LibAUC: A Deep Learning Library for X-Risk Optimization

Jun 05, 2023
Zhuoning Yuan, Dixian Zhu, Zi-Hao Qiu, Gang Li, Xuanhui Wang, Tianbao Yang

Figure 1 for LibAUC: A Deep Learning Library for X-Risk Optimization
Figure 2 for LibAUC: A Deep Learning Library for X-Risk Optimization
Figure 3 for LibAUC: A Deep Learning Library for X-Risk Optimization
Figure 4 for LibAUC: A Deep Learning Library for X-Risk Optimization

This paper introduces the award-winning deep learning (DL) library called LibAUC for implementing state-of-the-art algorithms towards optimizing a family of risk functions named X-risks. X-risks refer to a family of compositional functions in which the loss function of each data point is defined in a way that contrasts the data point with a large number of others. They have broad applications in AI for solving classical and emerging problems, including but not limited to classification for imbalanced data (CID), learning to rank (LTR), and contrastive learning of representations (CLR). The motivation of developing LibAUC is to address the convergence issues of existing libraries for solving these problems. In particular, existing libraries may not converge or require very large mini-batch sizes in order to attain good performance for these problems, due to the usage of the standard mini-batch technique in the empirical risk minimization (ERM) framework. Our library is for deep X-risk optimization (DXO) that has achieved great success in solving a variety of tasks for CID, LTR and CLR. The contributions of this paper include: (1) It introduces a new mini-batch based pipeline for implementing DXO algorithms, which differs from existing DL pipeline in the design of controlled data samplers and dynamic mini-batch losses; (2) It provides extensive benchmarking experiments for ablation studies and comparison with existing libraries. The LibAUC library features scalable performance for millions of items to be contrasted, faster and better convergence than existing libraries for optimizing X-risks, seamless PyTorch deployment and versatile APIs for various loss optimization. Our library is available to the open source community at https://github.com/Optimization-AI/LibAUC, to facilitate further academic research and industrial applications.

* Accepted by KDD2023 
Viaarxiv icon

PaLI-X: On Scaling up a Multilingual Vision and Language Model

May 29, 2023
Xi Chen, Josip Djolonga, Piotr Padlewski, Basil Mustafa, Soravit Changpinyo, Jialin Wu, Carlos Riquelme Ruiz, Sebastian Goodman, Xiao Wang, Yi Tay, Siamak Shakeri, Mostafa Dehghani, Daniel Salz, Mario Lucic, Michael Tschannen, Arsha Nagrani, Hexiang Hu, Mandar Joshi, Bo Pang, Ceslee Montgomery, Paulina Pietrzyk, Marvin Ritter, AJ Piergiovanni, Matthias Minderer, Filip Pavetic, Austin Waters, Gang Li, Ibrahim Alabdulmohsin, Lucas Beyer, Julien Amelot, Kenton Lee, Andreas Peter Steiner, Yang Li, Daniel Keysers, Anurag Arnab, Yuanzhong Xu, Keran Rong, Alexander Kolesnikov, Mojtaba Seyedhosseini, Anelia Angelova, Xiaohua Zhai, Neil Houlsby, Radu Soricut

Figure 1 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 2 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 3 for PaLI-X: On Scaling up a Multilingual Vision and Language Model
Figure 4 for PaLI-X: On Scaling up a Multilingual Vision and Language Model

We present the training recipe and results of scaling up PaLI-X, a multilingual vision and language model, both in terms of size of the components and the breadth of its training task mixture. Our model achieves new levels of performance on a wide-range of varied and complex tasks, including multiple image-based captioning and question-answering tasks, image-based document understanding and few-shot (in-context) learning, as well as object detection, video question answering, and video captioning. PaLI-X advances the state-of-the-art on most vision-and-language benchmarks considered (25+ of them). Finally, we observe emerging capabilities, such as complex counting and multilingual object detection, tasks that are not explicitly in the training mix.

Viaarxiv icon

Learning a Single Convolutional Layer Model for Low Light Image Enhancement

May 23, 2023
Yuantong Zhang, Baoxin Teng, Daiqin Yang, Zhenzhong Chen, Haichuan Ma, Gang Li, Wenpeng Ding

Figure 1 for Learning a Single Convolutional Layer Model for Low Light Image Enhancement
Figure 2 for Learning a Single Convolutional Layer Model for Low Light Image Enhancement
Figure 3 for Learning a Single Convolutional Layer Model for Low Light Image Enhancement
Figure 4 for Learning a Single Convolutional Layer Model for Low Light Image Enhancement

Low-light image enhancement (LLIE) aims to improve the illuminance of images due to insufficient light exposure. Recently, various lightweight learning-based LLIE methods have been proposed to handle the challenges of unfavorable prevailing low contrast, low brightness, etc. In this paper, we have streamlined the architecture of the network to the utmost degree. By utilizing the effective structural re-parameterization technique, a single convolutional layer model (SCLM) is proposed that provides global low-light enhancement as the coarsely enhanced results. In addition, we introduce a local adaptation module that learns a set of shared parameters to accomplish local illumination correction to address the issue of varied exposure levels in different image regions. Experimental results demonstrate that the proposed method performs favorably against the state-of-the-art LLIE methods in both objective metrics and subjective visual effects. Additionally, our method has fewer parameters and lower inference complexity compared to other learning-based schemes.

Viaarxiv icon