Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jun Zhou

Yilin

Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis

Feb 06, 2025

Lin Yuan, Jun Xu, Honghao Gui, Mengshu Sun, Zhiqiang Zhang, Lei Liang, Jun Zhou

Figure 1 for Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis

Figure 2 for Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis

Figure 3 for Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis

Figure 4 for Improving Natural Language Understanding for LLMs via Large-Scale Instruction Synthesis

Abstract:High-quality, large-scale instructions are crucial for aligning large language models (LLMs), however, there is a severe shortage of instruction in the field of natural language understanding (NLU). Previous works on constructing NLU instructions mainly focus on information extraction (IE), neglecting tasks such as machine reading comprehension, question answering, and text classification. Furthermore, the lack of diversity in the data has led to a decreased generalization ability of trained LLMs in other NLU tasks and a noticeable decline in the fundamental model's general capabilities. To address this issue, we propose Hum, a large-scale, high-quality synthetic instruction corpus for NLU tasks, designed to enhance the NLU capabilities of LLMs. Specifically, Hum includes IE (either close IE or open IE), machine reading comprehension, text classification, and instruction generalist tasks, thereby enriching task diversity. Additionally, we introduce a human-LLMs collaborative mechanism to synthesize instructions, which enriches instruction diversity by incorporating guidelines, preference rules, and format variants. We conduct extensive experiments on 5 NLU tasks and 28 general capability evaluation datasets for LLMs. Experimental results show that Hum enhances the NLU capabilities of six LLMs by an average of 3.1\%, with no significant decline observed in other general capabilities.

* Accepted by AAAI 2025

Via

Access Paper or Ask Questions

MAQInstruct: Instruction-based Unified Event Relation Extraction

Feb 06, 2025

Jun Xu, Mengshu Sun, Zhiqiang Zhang, Jun Zhou

Abstract:Extracting event relations that deviate from known schemas has proven challenging for previous methods based on multi-class classification, MASK prediction, or prototype matching. Recent advancements in large language models have shown impressive performance through instruction tuning. Nevertheless, in the task of event relation extraction, instruction-based methods face several challenges: there are a vast number of inference samples, and the relations between events are non-sequential. To tackle these challenges, we present an improved instruction-based event relation extraction framework named MAQInstruct. Firstly, we transform the task from extracting event relations using given event-event instructions to selecting events using given event-relation instructions, which reduces the number of samples required for inference. Then, by incorporating a bipartite matching loss, we reduce the dependency of the instruction-based method on the generation sequence. Our experimental results demonstrate that MAQInstruct significantly improves the performance of event relation extraction across multiple LLMs.

* Accepted by WWW 2025 short

Via

Access Paper or Ask Questions

Determining Mosaic Resilience in Sugarcane Plants using Hyperspectral Images

Jan 28, 2025

Ali Zia, Jun Zhou, Muyiwa Olayemi

Abstract:Sugarcane mosaic disease poses a serious threat to the Australian sugarcane industry, leading to yield losses of up to 30% in susceptible varieties. Existing manual inspection methods for detecting mosaic resilience are inefficient and impractical for large-scale application. This study introduces a novel approach using hyperspectral imaging and machine learning to detect mosaic resilience by leveraging global feature representation from local spectral patches. Hyperspectral data were collected from eight sugarcane varieties under controlled and field conditions. Local spectral patches were analyzed to capture spatial and spectral variations, which were then aggregated into global feature representations using a ResNet18 deep learning architecture. While classical methods like Support Vector Machines struggled to utilize spatial-spectral relationships effectively, the deep learning model achieved high classification accuracy, demonstrating its capacity to identify mosaic resilience from fine-grained hyperspectral data. This approach enhances early detection capabilities, enabling more efficient management of susceptible strains and contributing to sustainable sugarcane production.

Via

Access Paper or Ask Questions

Are Traditional Deep Learning Model Approaches as Effective as a Retinal-Specific Foundation Model for Ocular and Systemic Disease Detection?

Jan 21, 2025

Samantha Min Er Yew, Xiaofeng Lei, Jocelyn Hui Lin Goh, Yibing Chen, Sahana Srinivasan, Miao-li Chee, Krithi Pushpanathan, Ke Zou, Qingshan Hou, Zhi Da Soh(+19 more)

Abstract:Background: RETFound, a self-supervised, retina-specific foundation model (FM), showed potential in downstream applications. However, its comparative performance with traditional deep learning (DL) models remains incompletely understood. This study aimed to evaluate RETFound against three ImageNet-pretrained supervised DL models (ResNet50, ViT-base, SwinV2) in detecting ocular and systemic diseases. Methods: We fine-tuned/trained RETFound and three DL models on full datasets, 50%, 20%, and fixed sample sizes (400, 200, 100 images, with half comprising disease cases; for each DR severity class, 100 and 50 cases were used. Fine-tuned models were tested internally using the SEED (53,090 images) and APTOS-2019 (3,672 images) datasets and externally validated on population-based (BES, CIEMS, SP2, UKBB) and open-source datasets (ODIR-5k, PAPILA, GAMMA, IDRiD, MESSIDOR-2). Model performance was compared using area under the receiver operating characteristic curve (AUC) and Z-tests with Bonferroni correction (P<0.05/3). Interpretation: Traditional DL models are mostly comparable to RETFound for ocular disease detection with large datasets. However, RETFound is superior in systemic disease detection with smaller datasets. These findings offer valuable insights into the respective merits and limitation of traditional models and FMs.

Via

Access Paper or Ask Questions

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

Jan 06, 2025

Rui Xie, Yinhong Liu, Penghao Zhou, Chen Zhao, Jun Zhou, Kai Zhang, Zhenyu Zhang, Jian Yang, Zhenheng Yang, Ying Tai

Abstract:Image diffusion models have been adapted for real-world video super-resolution to tackle over-smoothing issues in GAN-based methods. However, these models struggle to maintain temporal consistency, as they are trained on static images, limiting their ability to capture temporal dynamics effectively. Integrating text-to-video (T2V) models into video super-resolution for improved temporal modeling is straightforward. However, two key challenges remain: artifacts introduced by complex degradations in real-world scenarios, and compromised fidelity due to the strong generative capacity of powerful T2V models (\textit{e.g.}, CogVideoX-5B). To enhance the spatio-temporal quality of restored videos, we introduce\textbf{~\name} (\textbf{S}patial-\textbf{T}emporal \textbf{A}ugmentation with T2V models for \textbf{R}eal-world video super-resolution), a novel approach that leverages T2V models for real-world video super-resolution, achieving realistic spatial details and robust temporal consistency. Specifically, we introduce a Local Information Enhancement Module (LIEM) before the global attention block to enrich local details and mitigate degradation artifacts. Moreover, we propose a Dynamic Frequency (DF) Loss to reinforce fidelity, guiding the model to focus on different frequency components across diffusion steps. Extensive experiments demonstrate\textbf{~\name}~outperforms state-of-the-art methods on both synthetic and real-world datasets.

Via

Access Paper or Ask Questions

From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Jan 05, 2025

Tianyang Wang, Yunze Wang, Jun Zhou, Benji Peng, Xinyuan Song, Charles Zhang, Xintian Sun, Qian Niu, Junyu Liu, Silin Chen(+9 more)

Figure 1 for From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Figure 2 for From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Figure 3 for From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Figure 4 for From Aleatoric to Epistemic: Exploring Uncertainty Quantification Techniques in Artificial Intelligence

Abstract:Uncertainty quantification (UQ) is a critical aspect of artificial intelligence (AI) systems, particularly in high-risk domains such as healthcare, autonomous systems, and financial technology, where decision-making processes must account for uncertainty. This review explores the evolution of uncertainty quantification techniques in AI, distinguishing between aleatoric and epistemic uncertainties, and discusses the mathematical foundations and methods used to quantify these uncertainties. We provide an overview of advanced techniques, including probabilistic methods, ensemble learning, sampling-based approaches, and generative models, while also highlighting hybrid approaches that integrate domain-specific knowledge. Furthermore, we examine the diverse applications of UQ across various fields, emphasizing its impact on decision-making, predictive accuracy, and system robustness. The review also addresses key challenges such as scalability, efficiency, and integration with explainable AI, and outlines future directions for research in this rapidly developing area. Through this comprehensive survey, we aim to provide a deeper understanding of UQ's role in enhancing the reliability, safety, and trustworthiness of AI systems.

* 14 pages

Via

Access Paper or Ask Questions

A Separable Self-attention Inspired by the State Space Model for Computer Vision

Jan 03, 2025

Juntao Zhang, Shaogeng Liu, Kun Bian, You Zhou, Pei Zhang, Jianning Liu, Jun Zhou, Bingyan Liu

Figure 1 for A Separable Self-attention Inspired by the State Space Model for Computer Vision

Figure 2 for A Separable Self-attention Inspired by the State Space Model for Computer Vision

Figure 3 for A Separable Self-attention Inspired by the State Space Model for Computer Vision

Figure 4 for A Separable Self-attention Inspired by the State Space Model for Computer Vision

Abstract:Mamba is an efficient State Space Model (SSM) with linear computational complexity. Although SSMs are not suitable for handling non-causal data, Vision Mamba (ViM) methods still demonstrate good performance in tasks such as image classification and object detection. Recent studies have shown that there is a rich theoretical connection between state space models and attention variants. We propose a novel separable self attention method, for the first time introducing some excellent design concepts of Mamba into separable self-attention. To ensure a fair comparison with ViMs, we introduce VMINet, a simple yet powerful prototype architecture, constructed solely by stacking our novel attention modules with the most basic down-sampling layers. Notably, VMINet differs significantly from the conventional Transformer architecture. Our experiments demonstrate that VMINet has achieved competitive results on image classification and high-resolution dense prediction tasks.Code is available at: \url{https://github.com/yws-wxs/VMINet}.

Via

Access Paper or Ask Questions

OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Dec 28, 2024

Yujie Luo, Xiangyuan Ru, Kangwei Liu, Lin Yuan, Mengshu Sun, Ningyu Zhang, Lei Liang, Zhiqiang Zhang, Jun Zhou, Lanning Wei(+3 more)

Figure 1 for OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Figure 2 for OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Figure 3 for OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System

Abstract:We introduce OneKE, a dockerized schema-guided knowledge extraction system, which can extract knowledge from the Web and raw PDF Books, and support various domains (science, news, etc.). Specifically, we design OneKE with multiple agents and a configure knowledge base. Different agents perform their respective roles, enabling support for various extraction scenarios. The configure knowledge base facilitates schema configuration, error case debugging and correction, further improving the performance. Empirical evaluations on benchmark datasets demonstrate OneKE's efficacy, while case studies further elucidate its adaptability to diverse tasks across multiple domains, highlighting its potential for broad applications. We have open-sourced the Code at https://github.com/zjunlp/OneKE and released a Video at http://oneke.openkg.cn/demo.mp4.

* Work in progress

Via

Access Paper or Ask Questions

Learning Causal Transition Matrix for Instance-dependent Label Noise

Dec 18, 2024

Jiahui Li, Tai-Wei Chang, Kun Kuang, Ximing Li, Long Chen, Jun Zhou

Figure 1 for Learning Causal Transition Matrix for Instance-dependent Label Noise

Figure 2 for Learning Causal Transition Matrix for Instance-dependent Label Noise

Figure 3 for Learning Causal Transition Matrix for Instance-dependent Label Noise

Figure 4 for Learning Causal Transition Matrix for Instance-dependent Label Noise

Abstract:Noisy labels are both inevitable and problematic in machine learning methods, as they negatively impact models' generalization ability by causing overfitting. In the context of learning with noise, the transition matrix plays a crucial role in the design of statistically consistent algorithms. However, the transition matrix is often considered unidentifiable. One strand of methods typically addresses this problem by assuming that the transition matrix is instance-independent; that is, the probability of mislabeling a particular instance is not influenced by its characteristics or attributes. This assumption is clearly invalid in complex real-world scenarios. To better understand the transition relationship and relax this assumption, we propose to study the data generation process of noisy labels from a causal perspective. We discover that an unobservable latent variable can affect either the instance itself, the label annotation procedure, or both, which complicates the identification of the transition matrix. To address various scenarios, we have unified these observations within a new causal graph. In this graph, the input instance is divided into a noise-resistant component and a noise-sensitive component based on whether they are affected by the latent variable. These two components contribute to identifying the ``causal transition matrix'', which approximates the true transition matrix with theoretical guarantee. In line with this, we have designed a novel training framework that explicitly models this causal relationship and, as a result, achieves a more accurate model for inferring the clean label.

Via

Access Paper or Ask Questions

RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification

Dec 17, 2024

Guanwenjie Zou, Liang Yao, Fan Liu, Chuanyi Zhang, Xin Li, Ning Chen, Shengxiang Xu, Jun Zhou

Figure 1 for RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification

Figure 2 for RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification

Figure 3 for RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification

Figure 4 for RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification

Abstract:Since high resolution remote sensing image classification often requires a relatively high computation complexity, lightweight models tend to be practical and efficient. Model pruning is an effective method for model compression. However, existing methods rarely take into account the specificity of remote sensing images, resulting in significant accuracy loss after pruning. To this end, we propose an effective structural pruning approach for remote sensing image classification. Specifically, a pruning strategy that amplifies the differences in channel importance of the model is introduced. Then an adaptive mining loss function is designed for the fine-tuning process of the pruned model. Finally, we conducted experiments on two remote sensing classification datasets. The experimental results demonstrate that our method achieves minimal accuracy loss after compressing remote sensing classification models, achieving state-of-the-art (SoTA) performance.

Via

Access Paper or Ask Questions