Alert button
Picture for Yi Chang

Yi Chang

Alert button

B-Spine: Learning B-Spline Curve Representation for Robust and Interpretable Spinal Curvature Estimation

Oct 14, 2023
Hao Wang, Qiang Song, Ruofeng Yin, Rui Ma, Yizhou Yu, Yi Chang

Spinal curvature estimation is important to the diagnosis and treatment of the scoliosis. Existing methods face several issues such as the need of expensive annotations on the vertebral landmarks and being sensitive to the image quality. It is challenging to achieve robust estimation and obtain interpretable results, especially for low-quality images which are blurry and hazy. In this paper, we propose B-Spine, a novel deep learning pipeline to learn B-spline curve representation of the spine and estimate the Cobb angles for spinal curvature estimation from low-quality X-ray images. Given a low-quality input, a novel SegRefine network which employs the unpaired image-to-image translation is proposed to generate a high quality spine mask from the initial segmentation result. Next, a novel mask-based B-spline prediction model is proposed to predict the B-spline curve for the spine centerline. Finally, the Cobb angles are estimated by a hybrid approach which combines the curve slope analysis and a curve-based regression model. We conduct quantitative and qualitative comparisons with the representative and SOTA learning-based methods on the public AASCE2019 dataset and our new proposed CJUH-JLU dataset which contains more challenging low-quality images. The superior performance on both datasets shows our method can achieve both robustness and interpretability for spinal curvature estimation.

Viaarxiv icon

Learning Generalizable Agents via Saliency-Guided Features Decorrelation

Oct 08, 2023
Sili Huang, Yanchao Sun, Jifeng Hu, Siyuan Guo, Hechang Chen, Yi Chang, Lichao Sun, Bo Yang

Figure 1 for Learning Generalizable Agents via Saliency-Guided Features Decorrelation
Figure 2 for Learning Generalizable Agents via Saliency-Guided Features Decorrelation
Figure 3 for Learning Generalizable Agents via Saliency-Guided Features Decorrelation
Figure 4 for Learning Generalizable Agents via Saliency-Guided Features Decorrelation

In visual-based Reinforcement Learning (RL), agents often struggle to generalize well to environmental variations in the state space that were not observed during training. The variations can arise in both task-irrelevant features, such as background noise, and task-relevant features, such as robot configurations, that are related to the optimal decisions. To achieve generalization in both situations, agents are required to accurately understand the impact of changed features on the decisions, i.e., establishing the true associations between changed features and decisions in the policy model. However, due to the inherent correlations among features in the state space, the associations between features and decisions become entangled, making it difficult for the policy to distinguish them. To this end, we propose Saliency-Guided Features Decorrelation (SGFD) to eliminate these correlations through sample reweighting. Concretely, SGFD consists of two core techniques: Random Fourier Functions (RFF) and the saliency map. RFF is utilized to estimate the complex non-linear correlations in high-dimensional images, while the saliency map is designed to identify the changed features. Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features, thereby achieving decorrelation in visual RL tasks. Our experimental results demonstrate that SGFD can generalize well on a wide range of test environments and significantly outperforms state-of-the-art methods in handling both task-irrelevant variations and task-relevant variations.

Viaarxiv icon

Careful at Estimation and Bold at Exploration

Aug 22, 2023
Xing Chen, Yijun Liu, Zhaogeng Liu, Hechang Chen, Hengshuai Yao, Yi Chang

Figure 1 for Careful at Estimation and Bold at Exploration
Figure 2 for Careful at Estimation and Bold at Exploration
Figure 3 for Careful at Estimation and Bold at Exploration
Figure 4 for Careful at Estimation and Bold at Exploration

Exploration strategies in continuous action space are often heuristic due to the infinite actions, and these kinds of methods cannot derive a general conclusion. In prior work, it has been shown that policy-based exploration is beneficial for continuous action space in deterministic policy reinforcement learning(DPRL). However, policy-based exploration in DPRL has two prominent issues: aimless exploration and policy divergence, and the policy gradient for exploration is only sometimes helpful due to inaccurate estimation. Based on the double-Q function framework, we introduce a novel exploration strategy to mitigate these issues, separate from the policy gradient. We first propose the greedy Q softmax update schema for Q value update. The expected Q value is derived by weighted summing the conservative Q value over actions, and the weight is the corresponding greedy Q value. Greedy Q takes the maximum value of the two Q functions, and conservative Q takes the minimum value of the two different Q functions. For practicality, this theoretical basis is then extended to allow us to combine action exploration with the Q value update, except for the premise that we have a surrogate policy that behaves like this exploration policy. In practice, we construct such an exploration policy with a few sampled actions, and to meet the premise, we learn such a surrogate policy by minimizing the KL divergence between the target policy and the exploration policy constructed by the conservative Q. We evaluate our method on the Mujoco benchmark and demonstrate superior performance compared to previous state-of-the-art methods across various environments, particularly in the most complex Humanoid environment.

* 20 pages 
Viaarxiv icon

From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal

Aug 18, 2023
Yun Guo, Xueyao Xiao, Yi Chang, Shumin Deng, Luxin Yan

Figure 1 for From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
Figure 2 for From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
Figure 3 for From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal
Figure 4 for From Sky to the Ground: A Large-scale Benchmark and Simple Baseline Towards Real Rain Removal

Learning-based image deraining methods have made great progress. However, the lack of large-scale high-quality paired training samples is the main bottleneck to hamper the real image deraining (RID). To address this dilemma and advance RID, we construct a Large-scale High-quality Paired real rain benchmark (LHP-Rain), including 3000 video sequences with 1 million high-resolution (1920*1080) frame pairs. The advantages of the proposed dataset over the existing ones are three-fold: rain with higher-diversity and larger-scale, image with higher-resolution and higher-quality ground-truth. Specifically, the real rains in LHP-Rain not only contain the classical rain streak/veiling/occlusion in the sky, but also the \textbf{splashing on the ground} overlooked by deraining community. Moreover, we propose a novel robust low-rank tensor recovery model to generate the GT with better separating the static background from the dynamic rain. In addition, we design a simple transformer-based single image deraining baseline, which simultaneously utilize the self-attention and cross-layer attention within the image and rain layer with discriminative feature representation. Extensive experiments verify the superiority of the proposed dataset and deraining method over state-of-the-art.

* Accepted by ICCV 2023 
Viaarxiv icon

Information Retrieval Meets Large Language Models: A Strategic Report from Chinese IR Community

Jul 27, 2023
Qingyao Ai, Ting Bai, Zhao Cao, Yi Chang, Jiawei Chen, Zhumin Chen, Zhiyong Cheng, Shoubin Dong, Zhicheng Dou, Fuli Feng, Shen Gao, Jiafeng Guo, Xiangnan He, Yanyan Lan, Chenliang Li, Yiqun Liu, Ziyu Lyu, Weizhi Ma, Jun Ma, Zhaochun Ren, Pengjie Ren, Zhiqiang Wang, Mingwen Wang, Ji-Rong Wen, Le Wu, Xin Xin, Jun Xu, Dawei Yin, Peng Zhang, Fan Zhang, Weinan Zhang, Min Zhang, Xiaofei Zhu

Figure 1 for Information Retrieval Meets Large Language Models: A Strategic Report from Chinese IR Community

The research field of Information Retrieval (IR) has evolved significantly, expanding beyond traditional search to meet diverse user information needs. Recently, Large Language Models (LLMs) have demonstrated exceptional capabilities in text understanding, generation, and knowledge inference, opening up exciting avenues for IR research. LLMs not only facilitate generative retrieval but also offer improved solutions for user understanding, model evaluation, and user-system interactions. More importantly, the synergistic relationship among IR models, LLMs, and humans forms a new technical paradigm that is more powerful for information seeking. IR models provide real-time and relevant information, LLMs contribute internal knowledge, and humans play a central role of demanders and evaluators to the reliability of information services. Nevertheless, significant challenges exist, including computational costs, credibility concerns, domain-specific limitations, and ethical considerations. To thoroughly discuss the transformative impact of LLMs on IR research, the Chinese IR community conducted a strategic workshop in April 2023, yielding valuable insights. This paper provides a summary of the workshop's outcomes, including the rethinking of IR's core values, the mutual enhancement of LLMs and IR, the proposal of a novel IR technical paradigm, and open challenges.

* 17 pages 
Viaarxiv icon

A Survey on Evaluation of Large Language Models

Jul 18, 2023
Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Kaijie Zhu, Hao Chen, Linyi Yang, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie

Figure 1 for A Survey on Evaluation of Large Language Models
Figure 2 for A Survey on Evaluation of Large Language Models
Figure 3 for A Survey on Evaluation of Large Language Models
Figure 4 for A Survey on Evaluation of Large Language Models

Large language models (LLMs) are gaining increasing popularity in both academia and industry, owing to their unprecedented performance in various applications. As LLMs continue to play a vital role in both research and daily use, their evaluation becomes increasingly critical, not only at the task level, but also at the society level for better understanding of their potential risks. Over the past years, significant efforts have been made to examine LLMs from various perspectives. This paper presents a comprehensive review of these evaluation methods for LLMs, focusing on three key dimensions: what to evaluate, where to evaluate, and how to evaluate. Firstly, we provide an overview from the perspective of evaluation tasks, encompassing general natural language processing tasks, reasoning, medical usage, ethics, educations, natural and social sciences, agent applications, and other areas. Secondly, we answer the `where' and `how' questions by diving into the evaluation methods and benchmarks, which serve as crucial components in assessing performance of LLMs. Then, we summarize the success and failure cases of LLMs in different tasks. Finally, we shed light on several future challenges that lie ahead in LLMs evaluation. Our aim is to offer invaluable insights to researchers in the realm of LLMs evaluation, thereby aiding the development of more proficient LLMs. Our key point is that evaluation should be treated as an essential discipline to better assist the development of LLMs. We consistently maintain the related open-source materials at: https://github.com/MLGroupJLU/LLM-eval-survey.

* 25 pages; more work is at: https://llm-eval.github.io/ 
Viaarxiv icon

1st Solution Places for CVPR 2023 UG$^2$+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence

Jun 15, 2023
Shengqi Xu, Shuning Cao, Haoyue Liu, Xueyao Xiao, Yi Chang, Luxin Yan

Figure 1 for 1st Solution Places for CVPR 2023 UG$^2$+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence
Figure 2 for 1st Solution Places for CVPR 2023 UG$^2$+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence
Figure 3 for 1st Solution Places for CVPR 2023 UG$^2$+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence
Figure 4 for 1st Solution Places for CVPR 2023 UG$^2$+ Challenge Track 2.2-Coded Target Restoration through Atmospheric Turbulence

In this technical report, we briefly introduce the solution of our team VIELab-HUST for coded target restoration through atmospheric turbulence in CVPR 2023 UG$^2$+ Track 2.2. In this task, we propose an efficient multi-stage framework to restore a high quality image from distorted frames. Specifically, each distorted frame is initially aligned using image registration to suppress geometric distortion. We subsequently select the sharpest set of registered frames by employing a frame selection approach based on image sharpness, and average them to produce an image that is largely free of geometric distortion, albeit with blurriness. A learning-based deblurring method is then applied to remove the residual blur in the averaged image. Finally, post-processing techniques are utilized to further enhance the quality of the output image. Our framework is capable of handling different kinds of coded target dataset provided in the final testing phase, and ranked 1st on the final leaderboard. Our code will be available at https://github.com/xsqhust/Turbulence_Removal.

* arXiv admin note: text overlap with arXiv:2306.08963 
Viaarxiv icon