Zhiqiang Gao

Curriculum Learning for Biological Sequence Prediction: The Case of De Novo Peptide Sequencing

Jun 16, 2025

Xiang Zhang, Jiaqi Wei, Zijie Qiu, Sheng Xu, Nanqing Dong, Zhiqiang Gao, Siqi Sun

Abstract: Peptide sequencing, the process of identifying amino acid sequences from mass spectrometry data, is a fundamental task in proteomics. Non-Autoregressive Transformers (NATs) have proven highly effective for this task, outperforming traditional methods. Unlike autoregressive models, which generate tokens sequentially, NATs predict all positions simultaneously, leveraging bidirectional context through unmasked self-attention. However, existing NAT approaches often rely on Connectionist Temporal Classification (CTC) loss, which presents significant optimization challenges due to CTC's complexity and increases the risk of training failures. To address these issues, we propose an improved non-autoregressive peptide sequencing model that incorporates a structured protein sequence curriculum learning strategy. This approach adjusts the learning difficulty of each sequence based on the model's estimated generation capability, obtained through a sampling process, so that the model progressively learns peptide generation from simple to complex sequences. Additionally, we introduce a self-refining inference-time module that iteratively enhances predictions using learned NAT token embeddings, improving sequence accuracy at a fine-grained level. Our curriculum learning strategy reduces the frequency of NAT training failures by more than 90%, based on sampled training over various data distributions. Evaluations on nine benchmark species demonstrate that our approach outperforms all previous methods across multiple metrics and species.
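For readers unfamiliar with CTC, the sketch below illustrates only the standard CTC collapsing rule (merge adjacent repeated tokens, then drop blanks), which is what lets a fixed-length parallel output map to a shorter peptide. It is not the paper's model or training code; the blank symbol `_` and the function name `ctc_collapse` are assumptions made for illustration.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol; real models reserve a token id for it


def ctc_collapse(path, blank=BLANK):
    """Collapse a CTC alignment path: merge adjacent repeats, then drop blanks."""
    merged = [token for token, _ in groupby(path)]
    return [token for token in merged if token != blank]


# A blank between two identical tokens keeps them as separate residues.
path = list("PP_E__PP_P")
print("".join(ctc_collapse(path)))
```

Many different paths collapse to the same sequence, and the CTC loss sums over all of them, which is one source of the optimization difficulty the abstract mentions.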

PhySense: Principle-Based Physics Reasoning Benchmarking for Large Language Models

May 30, 2025

Yinggan Xu, Yue Liu, Zhiqiang Gao, Changnan Peng, Di Luo

Abstract: Large language models (LLMs) have rapidly advanced and are increasingly capable of tackling complex scientific problems, including those in physics. Despite this progress, current LLMs often fail to emulate the concise, principle-based reasoning characteristic of human experts, instead generating lengthy and opaque solutions. This discrepancy highlights a crucial gap in their ability to apply core physical principles for efficient and interpretable problem solving. To systematically investigate this limitation, we introduce PhySense, a novel principle-based physics reasoning benchmark designed to be easily solvable by experts using guiding principles, yet deceptively difficult for LLMs without principle-first reasoning. Our evaluation across multiple state-of-the-art LLMs and prompt types reveals a consistent failure to align with expert-like reasoning paths, providing insights for developing AI systems with efficient, robust and interpretable principle-based scientific reasoning.

Universal Biological Sequence Reranking for Improved De Novo Peptide Sequencing

May 23, 2025

Zijie Qiu, Jiaqi Wei, Xiang Zhang, Sheng Xu, Kai Zou, Zhi Jin, Zhiqiang Gao, Nanqing Dong, Siqi Sun

Abstract: De novo peptide sequencing is a critical task in proteomics. However, the performance of current deep learning-based methods is limited by the inherent complexity of mass spectrometry data and the heterogeneous distribution of noise signals, leading to data-specific biases. We present RankNovo, the first deep reranking framework that enhances de novo peptide sequencing by leveraging the complementary strengths of multiple sequencing models. RankNovo employs a list-wise reranking approach, modeling candidate peptides as multiple sequence alignments and utilizing axial attention to extract informative features across candidates. Additionally, we introduce two new metrics, PMD (Peptide Mass Deviation) and RMD (Residual Mass Deviation), which offer fine-grained supervision by quantifying mass differences between peptides at both the sequence and residue levels. Extensive experiments demonstrate that RankNovo not only surpasses the base models used to generate training candidates for reranking pre-training, but also sets a new state-of-the-art benchmark. Moreover, RankNovo exhibits strong zero-shot generalization to unseen models whose generations were not exposed during training, highlighting its robustness and potential as a universal reranking framework for peptide sequencing. Our work presents a novel reranking strategy that fundamentally challenges existing single-model paradigms and advances the frontier of accurate de novo sequencing. Our source code is provided on GitHub.
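To make the idea of a mass difference between two peptides concrete, here is a minimal sketch that sums standard monoisotopic residue masses and compares two candidates. It is not the PMD or RMD definition from the paper, which the abstract does not spell out; the function names `peptide_mass` and `mass_deviation` are hypothetical, and modifications are ignored.

```python
# Standard monoisotopic residue masses in daltons (unmodified amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER_MASS = 18.01056  # one water is added for the peptide termini


def peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def mass_deviation(candidate, reference):
    """Absolute mass difference between two peptides (illustrative only)."""
    return abs(peptide_mass(candidate) - peptide_mass(reference))


# Leucine and isoleucine are isobaric, so this deviation is zero.
print(mass_deviation("PEPTLDE", "PEPTIDE"))
# Lysine and glutamine differ by only a few hundredths of a dalton.
print(mass_deviation("K", "Q"))
```

Two candidates with different sequences can have nearly identical masses, which is why a mass-based signal can treat some errors as milder than a plain sequence mismatch would.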

Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation

May 09, 2025

Kunpeng Qiu, Zhiqiang Gao, Zhiying Zhou, Mingjie Sun, Yongxin Guo

Abstract: Deep learning has revolutionized medical image segmentation, yet its full potential remains constrained by the paucity of annotated datasets. While diffusion models have emerged as a promising approach for generating synthetic image-mask pairs to augment these datasets, they paradoxically suffer from the same data scarcity challenges they aim to mitigate. Traditional mask-only models frequently yield low-fidelity images due to their inability to adequately capture morphological intricacies, which can critically compromise the robustness and reliability of segmentation models. To alleviate this limitation, we introduce Siamese-Diffusion, a novel dual-component model comprising Mask-Diffusion and Image-Diffusion. During training, a Noise Consistency Loss is introduced between these components to enhance the morphological fidelity of Mask-Diffusion in the parameter space. During sampling, only Mask-Diffusion is used, ensuring diversity and scalability. Comprehensive experiments demonstrate the superiority of our method. Siamese-Diffusion boosts SANet's mDice and mIoU by 3.6% and 4.4% on the Polyps dataset, while UNet improves by 1.52% and 1.64% on ISIC2018. Code is available on GitHub.

* Accepted by CVPR 2025

Discrete Wavelet Transform-Based Capsule Network for Hyperspectral Image Classification

Jan 08, 2025

Zhiqiang Gao, Jiaqi Wang, Hangchi Shen, Zhihao Dou, Xiangbo Zhang, Kaizhu Huang

Abstract: Hyperspectral image (HSI) classification is a crucial technique for remote sensing to build large-scale earth monitoring systems. HSI contains much more information than traditional visual images for identifying the categories of land cover. One recent feasible solution for HSI is to leverage CapsNets for capturing spectral-spatial information. However, these methods have high computational requirements due to the fully connected architecture between stacked capsule layers. To solve this problem, a DWT-CapsNet is proposed to identify partial but important connections in CapsNet for effective and efficient HSI classification. Specifically, we integrate a tailored attention mechanism into a Discrete Wavelet Transform (DWT)-based downsampling layer, alleviating the information loss problem of conventional downsampling operations in feature extractors. Moreover, we propose a novel multi-scale routing algorithm that prunes a large proportion of connections in CapsNet. A capsule pyramid fusion mechanism is designed to aggregate the spectral-spatial relationships at multiple levels of granularity, and a self-attention mechanism is then applied in a partially and locally connected architecture to emphasize the meaningful relationships. As shown in the experimental results, our method achieves state-of-the-art accuracy while keeping lower computational demand in terms of running time, FLOPs, and the number of parameters, rendering it an appealing choice for practical implementation in HSI classification.

* 28 pages; 9 figures
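As a rough illustration of why DWT-based downsampling loses less information than pooling or striding, the sketch below computes one level of a 2D orthonormal Haar transform in plain Python. It is an assumption-laden toy, not the paper's layer: the Haar wavelet, the function name `haar_dwt2`, and the subband labels are chosen here for illustration, and the tailored attention mechanism is not shown.

```python
def haar_dwt2(image):
    """One level of the 2D orthonormal Haar transform.

    `image` is a list of equal-length rows with even height and width.
    Returns four half-resolution subbands (LL, LH, HL, HH); subband naming
    conventions vary between libraries.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top pair of the 2x2 block
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom pair
            ll_row.append((a + b + d + e) / 2)  # local average (low-pass)
            lh_row.append((a - b + d - e) / 2)  # left-right difference
            hl_row.append((a + b - d - e) / 2)  # top-bottom difference
            hh_row.append((a - b - d + e) / 2)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


ll, lh, hl, hh = haar_dwt2([[1, 2], [3, 4]])
print(ll, lh, hl, hh)
```

The four subbands together carry the same energy as the input, so the detail that average pooling would discard remains available in the LH, HL, and HH outputs.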

Unlocking the Non-Native Language Context Limitation: Native Language Prompting Facilitates Knowledge Elicitation

Aug 07, 2024

Baixuan Li, Yunlong Fan, Zhiqiang Gao

Abstract: Multilingual large language models (MLLMs) struggle to answer questions posed in non-dominant languages, even though they have already acquired the relevant knowledge from their dominant language corpus. In contrast, human multilinguals can overcome this issue by invoking the relatively rich knowledge acquired from native language texts through Positive Native Language Transfer (PNLT). Inspired by this, we analogize the dominant language of MLLMs to the native language of human multilinguals, and propose Native Language Prompting (NatLan) to simulate the PNLT observed in human multilinguals. It explicitly creates native language contexts for MLLMs to facilitate the elicitation of the rich native language knowledge during question-answering, unlocking the limitations imposed by non-native language contexts on the effective application of knowledge. By employing multi-MLLM collaboration, NatLan reduces the workload on each MLLM in simulating PNLT and refines semantic transfer. On the C-Eval benchmark, NatLan provides up to a 10.1% average accuracy improvement and up to a 5.0% increase on the hard-level subset across five MLLMs, surpassing all leading related methods. Our code is available at https://github.com/AnonyNLP/NatLan.

ContraNovo: A Contrastive Learning Approach to Enhance De Novo Peptide Sequencing

Dec 18, 2023

Zhi Jin, Sheng Xu, Xiang Zhang, Tianze Ling, Nanqing Dong, Wanli Ouyang, Zhiqiang Gao, Cheng Chang, Siqi Sun

Abstract: De novo peptide sequencing from mass spectrometry (MS) data is a critical task in proteomics research. Traditional de novo algorithms have encountered a bottleneck in accuracy due to the inherent complexity of proteomics data. While deep learning-based methods have shown progress, they reduce the problem to a translation task, potentially overlooking critical nuances between spectra and peptides. In our research, we present ContraNovo, a pioneering algorithm that leverages contrastive learning to extract the relationship between spectra and peptides and incorporates the mass information into peptide decoding, aiming to address these intricacies more efficiently. Through rigorous evaluations on two benchmark datasets, ContraNovo consistently outshines contemporary state-of-the-art solutions, underscoring its promising potential in enhancing de novo peptide sequencing. The source code is available at https://github.com/BEAM-Labs/ContraNovo.

* This paper has been accepted by AAAI 2024

A Higher-Order Semantic Dependency Parser

Jan 27, 2022

Bin Li, Yunlong Fan, Yikemaiti Sataer, Zhiqiang Gao

Abstract: Higher-order features bring significant accuracy gains in semantic dependency parsing. However, modeling higher-order features with exact inference is NP-hard. Graph neural networks (GNNs) have been demonstrated to be an effective tool for solving NP-hard problems with approximate inference in many graph learning tasks. Inspired by the success of GNNs, we investigate building a higher-order semantic dependency parser by applying GNNs. Instead of explicitly extracting higher-order features from intermediate parsing graphs, GNNs aggregate higher-order information concisely by stacking multiple GNN layers. Experimental results show that our model outperforms the previous state-of-the-art parser on the SemEval 2015 Task 18 English datasets.

An AI-based, Multi-stage detection system of banking botnets

Jul 25, 2019

Li Ling, Zhiqiang Gao, Michael A Silas, Ian Lee, Erwan A Le Doeuff

Abstract: Banking Trojans and botnets are primary drivers of financially motivated cybercrime. In this paper, we first analyze how an APT-based banking botnet works, step by step, through its whole lifecycle. Specifically, we present a multi-stage system that detects malicious banking botnet activities that potentially target organizations. The system leverages a Cyber Data Lake as well as multiple artificial intelligence techniques at different stages. The evaluation results using public datasets show that deep-learning-based detections were highly successful compared with baseline models.

* FEAP-AI4Fin 2018 : NIPS 2018 Workshop on Challenges and Opportunities for AI in Financial Services: the Impact of Fairness, Explainability, Accuracy, and Privacy
