Jun Liu

LMC: Large Model Collaboration with Cross-assessment for Training-Free Open-Set Object Recognition

Sep 22, 2023
Haoxuan Qu, Xiaofei Hui, Yujun Cai, Jun Liu

Open-set object recognition aims to identify whether an object belongs to a class encountered during training. A key challenge in performing it accurately is reducing the reliance on spurious-discriminative features. In this paper, motivated by the observation that large models pre-trained under different paradigms can possess rich yet distinct implicit knowledge, we propose a novel framework named Large Model Collaboration (LMC), which tackles this challenge by collaborating different off-the-shelf large models in a training-free manner. We further equip the framework with several novel designs to effectively extract implicit knowledge from the large models. Extensive experiments demonstrate the efficacy of our proposed framework. Code is available at https://github.com/Harryqu123/LMC.
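
As a rough illustration of the training-free collaboration idea only (not the paper's actual LMC pipeline or its cross-assessment design), the sketch below fuses the per-class probabilities of two pre-trained models: low confidence from either model, or disagreement between them, raises an "unknown" score. All names and the fusion rule are illustrative.

```python
# Minimal sketch: fusing two pre-trained models' class probabilities with a
# simple cross-model agreement rule for open-set recognition. Illustrative
# only; the real LMC framework uses different, more principled designs.
import numpy as np

def open_set_score(probs_a: np.ndarray, probs_b: np.ndarray) -> float:
    """Return an unknown-ness score in [0, 1] from two models' probabilities.

    probs_a, probs_b: per-class probabilities over the known classes,
    each summing to 1. Higher returned score -> more likely unknown.
    """
    conf_a, conf_b = probs_a.max(), probs_b.max()
    agree = probs_a.argmax() == probs_b.argmax()
    # Heuristic: a sample is suspicious when either model is unconfident
    # or when the two models disagree on the predicted class.
    score = 1.0 - min(conf_a, conf_b)
    if not agree:
        score = max(score, 0.5)
    return float(score)

# Example: confident agreement between the two models -> low unknown-ness.
p_a = np.array([0.90, 0.05, 0.05])
p_b = np.array([0.85, 0.10, 0.05])
print(open_set_score(p_a, p_b))  # ~0.15
```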

* NeurIPS 2023 

CloudBrain-NMR: An Intelligent Cloud Computing Platform for NMR Spectroscopy Processing, Reconstruction and Analysis

Sep 12, 2023
Di Guo, Sijin Li, Jun Liu, Zhangren Tu, Tianyu Qiu, Jingjing Xu, Liubin Feng, Donghai Lin, Qing Hong, Meijin Lin, Yanqin Lin, Xiaobo Qu

Nuclear Magnetic Resonance (NMR) spectroscopy is a powerful analytical tool for studying molecular structure and dynamics in chemistry and biology. However, processing the raw data acquired from NMR spectrometers and performing the subsequent quantitative analysis involve a variety of specialized tools, which demands comprehensive knowledge of both programming and NMR. In particular, emerging deep learning tools are hard to apply widely in NMR because of their sophisticated computational setup, so NMR processing is not an easy task for chemists and biologists. In this work, we present CloudBrain-NMR, an intelligent online cloud computing platform for NMR data reading, processing, reconstruction, and quantitative analysis. The platform is accessed through a web browser, eliminating the need for any program installation on the user side. CloudBrain-NMR uses parallel computing on graphics and central processing units, significantly shortening computation time. Furthermore, it incorporates state-of-the-art deep learning-based algorithms, offering comprehensive functionality that allows users to complete the entire processing procedure without relying on additional software. The platform thus brings advanced artificial intelligence processing to NMR applications. CloudBrain-NMR is openly accessible for free usage at https://csrc.xmu.edu.cn/CloudBrain.html
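
To make the kind of processing the platform automates concrete, here is a minimal sketch of one classic step of an NMR pipeline: converting a raw free induction decay (FID) into a spectrum via apodization, zero-filling, and FFT. The data are synthetic and the function is illustrative, not CloudBrain-NMR's API.

```python
# Minimal sketch of classic FID -> spectrum processing with NumPy.
import numpy as np

def fid_to_spectrum(fid: np.ndarray, dwell_time: float, lb_hz: float = 1.0):
    """Apodize, zero-fill to 2x length, and Fourier transform a complex FID."""
    n = fid.size
    t = np.arange(n) * dwell_time
    apodized = fid * np.exp(-np.pi * lb_hz * t)    # exponential line broadening
    zero_filled = np.pad(apodized, (0, n))         # zero-fill to 2n points
    spectrum = np.fft.fftshift(np.fft.fft(zero_filled))
    freqs = np.fft.fftshift(np.fft.fftfreq(zero_filled.size, d=dwell_time))
    return freqs, spectrum

# Synthetic one-peak FID: 100 Hz resonance decaying with T2* = 0.1 s.
dt = 1e-3
t = np.arange(2048) * dt
fid = np.exp(2j * np.pi * 100.0 * t) * np.exp(-t / 0.1)
freqs, spec = fid_to_spectrum(fid, dt)
print(freqs[np.argmax(spec.real)])  # ~100 Hz
```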

* 11 pages, 13 figures 

Distribution-Aligned Diffusion for Human Mesh Recovery

Sep 11, 2023
Lin Geng Foo, Jia Gong, Hossein Rahmani, Jun Liu

Recovering a 3D human mesh from a single RGB image is a challenging task due to depth ambiguity and self-occlusion, resulting in a high degree of uncertainty. Meanwhile, diffusion models have recently seen much success in generating high-quality outputs by progressively denoising noisy inputs. Inspired by their capability, we explore a diffusion-based approach for human mesh recovery, and propose a Human Mesh Diffusion (HMDiff) framework which frames mesh recovery as a reverse diffusion process. We also propose a Distribution Alignment Technique (DAT) that infuses prior distribution information into the mesh distribution diffusion process, and provides useful prior knowledge to facilitate the mesh recovery task. Our method achieves state-of-the-art performance on three widely used datasets. Project page: https://gongjia0208.github.io/HMDiff/.
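
For context, the sketch below shows the generic DDPM-style reverse-diffusion loop that such a framework builds on; the denoiser, noise schedule, and starting point are illustrative stand-ins, not the paper's HMDiff or DAT components.

```python
# Minimal sketch of DDPM-style ancestral sampling (reverse diffusion).
import torch

def reverse_diffusion(denoiser, x_T, betas):
    """Iteratively denoise x_T back to a sample x_0."""
    alphas = 1.0 - betas
    alpha_bars = torch.cumprod(alphas, dim=0)
    x = x_T
    for t in reversed(range(len(betas))):
        eps = denoiser(x, t)                         # predicted noise at step t
        coef = betas[t] / torch.sqrt(1.0 - alpha_bars[t])
        mean = (x - coef * eps) / torch.sqrt(alphas[t])
        noise = torch.randn_like(x) if t > 0 else torch.zeros_like(x)
        x = mean + torch.sqrt(betas[t]) * noise
    return x

# Toy usage: a "denoiser" that predicts zero noise, 10 diffusion steps.
betas = torch.linspace(1e-4, 0.02, 10)
x_T = torch.randn(1, 6890, 3)   # e.g., noisy vertices of an SMPL-sized mesh
x_0 = reverse_diffusion(lambda x, t: torch.zeros_like(x), x_T, betas)
print(x_0.shape)  # torch.Size([1, 6890, 3])
```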

* Accepted to ICCV 2023 

AIGC for Various Data Modalities: A Survey

Sep 09, 2023
Lin Geng Foo, Hossein Rahmani, Jun Liu

AI-generated content (AIGC) methods aim to produce text, images, videos, 3D assets, and other media using AI algorithms. Owing to its wide range of applications and the demonstrated potential of recent works, AIGC has attracted considerable attention, and methods have been developed for various data modalities, such as image, video, text, 3D shape (as voxels, point clouds, meshes, and neural implicit fields), 3D scene, 3D human avatar (body and head), 3D motion, and audio, each presenting different characteristics and challenges. There have also been many significant developments in cross-modality AIGC methods, where generative models receive conditioning input in one modality and produce outputs in another; examples include generation from various modalities to image, video, 3D shape, 3D scene, 3D avatar (body and head), 3D motion (skeleton and avatar), and audio. In this paper, we provide a comprehensive review of AIGC methods across data modalities, covering both single-modality and cross-modality methods and highlighting the challenges, representative works, and recent technical directions in each setting. We also survey representative datasets across the modalities, present comparative results for various modalities, and discuss open challenges and potential future research directions.

Deep Learning Overloaded Vehicle Identification for Long Span Bridges Based on Structural Health Monitoring Data

Sep 04, 2023
Yuqin Li, Jun Liu, Shengliang Zhong, Licheng Zhou, Shoubin Dong, Zejia Liu, Liqun Tang

Overloaded vehicles cause serious damage to transportation infrastructure. Bridge weigh-in-motion (BWIM) methods for overloaded vehicle identification are increasingly popular because they can be implemented without interrupting traffic. However, their application is still limited because their effectiveness largely depends on professional knowledge and extra information, and they are susceptible to the presence of multiple vehicles. In this paper, a deep-learning-based overloaded vehicle identification approach (DOVI) is proposed for identifying overloaded vehicles on long-span bridges from structural health monitoring data. The proposed DOVI model uses temporal convolutional architectures to extract the spatial and temporal features of the input sequence data, providing an end-to-end solution that requires neither the influence line nor prior knowledge of vehicle velocity and wheelbase, and that remains applicable when multiple vehicles are present. Model evaluations are conducted on a simply supported beam and a long-span cable-stayed bridge under random traffic flow. Results demonstrate that the proposed approach is more effective and robust than other machine learning and deep learning approaches.
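
As a rough illustration of this kind of architecture (not the paper's DOVI configuration), the sketch below stacks dilated 1-D convolutions over multi-channel sensor sequences and ends in a binary overloaded/normal head. All layer sizes are illustrative.

```python
# Minimal sketch of a temporal-convolutional classifier over SHM sequences.
import torch
import torch.nn as nn

class TinyTCN(nn.Module):
    def __init__(self, in_channels: int = 8, hidden: int = 32):
        super().__init__()
        # Dilated convolutions widen the receptive field over time.
        self.net = nn.Sequential(
            nn.Conv1d(in_channels, hidden, kernel_size=3, padding=1, dilation=1),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(),
        )
        self.head = nn.Linear(hidden, 2)   # overloaded vs. normal

    def forward(self, x):                  # x: (batch, channels, time)
        h = self.net(x).mean(dim=-1)       # global average pooling over time
        return self.head(h)

# Toy batch: 4 sequences, 8 sensor channels, 1024 time steps.
model = TinyTCN()
logits = model(torch.randn(4, 8, 1024))
print(logits.shape)  # torch.Size([4, 2])
```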

Radiology-Llama2: Best-in-Class Large Language Model for Radiology

Aug 29, 2023
Zhengliang Liu, Yiwei Li, Peng Shu, Aoxiao Zhong, Longtao Yang, Chao Ju, Zihao Wu, Chong Ma, Jie Luo, Cheng Chen, Sekeun Kim, Jiang Hu, Haixing Dai, Lin Zhao, Dajiang Zhu, Jun Liu, Wei Liu, Dinggang Shen, Tianming Liu, Quanzheng Li, Xiang Li

This paper introduces Radiology-Llama2, a large language model specialized for radiology through instruction tuning. Radiology-Llama2 is based on the Llama2 architecture and is further trained on a large dataset of radiology reports to generate coherent and clinically useful impressions from radiological findings. Quantitative evaluations using ROUGE metrics on the MIMIC-CXR and OpenI datasets demonstrate that Radiology-Llama2 achieves state-of-the-art performance compared with other generative language models, with ROUGE-1 scores of 0.4834 on MIMIC-CXR and 0.4185 on OpenI. Additional assessments by radiology experts highlight the model's strengths in understandability, coherence, relevance, conciseness, and clinical utility. The work illustrates the potential of localized language models designed and tuned for specialized domains like radiology; when properly evaluated and deployed, such models can transform the field by automating rote tasks and enhancing human expertise.
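
For reference, a ROUGE evaluation of the kind quoted above can be run with the Hugging Face evaluate package (pip install evaluate rouge_score); the findings/impression pair below is made up for illustration.

```python
# Minimal sketch of computing ROUGE-1 between generated and reference
# impressions. The example strings are fabricated for illustration.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["no acute cardiopulmonary abnormality"]
references = ["no acute cardiopulmonary process"]
scores = rouge.compute(predictions=predictions, references=references)
print(scores["rouge1"])  # unigram-overlap F1, the ROUGE-1 metric quoted above
```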

Unsupervised Domain Adaptation via Domain-Adaptive Diffusion

Aug 26, 2023
Duo Peng, Qiuhong Ke, Yinjie Lei, Jun Liu

Unsupervised Domain Adaptation (UDA) is challenging due to the large distribution discrepancy between the source and target domains. Inspired by the strong capability of diffusion models to gradually convert data distributions across a large gap, we explore diffusion techniques for the UDA task. However, using diffusion models to convert data distributions across domains is non-trivial, as standard diffusion models convert from the Gaussian distribution rather than from a specific domain distribution. Moreover, during the conversion, the semantics of the source-domain data need to be preserved for classification in the target domain. To tackle these problems, we propose a novel Domain-Adaptive Diffusion (DAD) module accompanied by a Mutual Learning Strategy (MLS), which gradually converts the data distribution from the source domain to the target domain while the classification model learns along the domain-transition process. Our method thus eases the UDA challenge by decomposing the large domain gap into small ones and gradually enhancing the classification model's capacity until it finally adapts to the target domain. It outperforms the current state of the art by a large margin on three widely used UDA datasets.
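
As a toy illustration of the decompose-the-gap idea only (not the paper's DAD module or MLS), the sketch below trains a classifier on a sequence of interpolations between a source batch and a target batch, using source labels at each small step across the gap.

```python
# Toy sketch: adapt a classifier along a gradual source -> target transition.
# The pixel-space interpolation is a crude stand-in for a learned domain
# conversion; data, model, and schedule are all illustrative.
import torch
import torch.nn as nn

classifier = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(classifier.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

source_x = torch.rand(16, 3, 32, 32)           # labeled source batch (toy data)
source_y = torch.randint(0, 10, (16,))
target_x = torch.rand(16, 3, 32, 32)           # unlabeled target batch (toy data)

for lam in torch.linspace(0.0, 1.0, steps=5):  # small steps across the gap
    mixed_x = (1 - lam) * source_x + lam * target_x  # stand-in for conversion
    logits = classifier(mixed_x)
    loss = loss_fn(logits, source_y)           # source labels supervise each step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"lambda={lam.item():.2f}  loss={loss.item():.3f}")
```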

* 11 pages, 4 figures 