De novo peptide sequencing from mass spectrometry (MS) data is a critical task in proteomics research. Traditional de novo algorithms have hit an accuracy bottleneck due to the inherent complexity of proteomics data. Deep learning-based methods have made progress, but they reduce the problem to a translation task and may overlook critical nuances in the relationship between spectra and peptides. We present ContraNovo, an algorithm that leverages contrastive learning to extract the relationship between spectra and peptides and incorporates mass information into peptide decoding, aiming to address these intricacies more effectively. In evaluations on two benchmark datasets, ContraNovo consistently outperforms contemporary state-of-the-art solutions, underscoring its potential for improving de novo peptide sequencing. The source code is available at https://github.com/BEAM-Labs/ContraNovo.
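To make the contrastive idea concrete, the following is a minimal sketch of a generic symmetric InfoNCE loss over a batch of matched (spectrum, peptide) embedding pairs. It is an illustration under stated assumptions, not ContraNovo's implementation: the abstract does not specify the loss, and the function names, the `temperature` value, and the toy embeddings are all hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a non-zero vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def log_softmax(row):
    # Numerically stable log-softmax over one row of logits.
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]

def contrastive_loss(spectrum_emb, peptide_emb, temperature=0.1):
    """Symmetric InfoNCE loss; row i of each input is assumed to be a matched pair."""
    s = [l2_normalize(v) for v in spectrum_emb]
    p = [l2_normalize(v) for v in peptide_emb]
    n = len(s)
    # logits[i][j]: scaled cosine similarity between spectrum i and peptide j.
    logits = [
        [sum(a * b for a, b in zip(s[i], p[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Spectrum -> peptide: spectrum i should rank its own peptide i highest.
    loss_s2p = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Peptide -> spectrum: peptide j should rank its own spectrum j highest.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_p2s = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_s2p + loss_p2s)

# Toy embeddings (hypothetical): correctly paired vs. deliberately swapped.
spectra = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = contrastive_loss(spectra, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = contrastive_loss(spectra, [[0.0, 1.0], [1.0, 0.0]])
```

In this toy setup the correctly paired batch yields a much smaller loss than the swapped batch, which is the signal that pulls matched spectra and peptides together in the shared embedding space.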
Higher-order features bring significant accuracy gains in semantic dependency parsing. However, modeling higher-order features with exact inference is NP-hard. Graph neural networks (GNNs) have been demonstrated to be an effective tool for solving NP-hard problems with approximate inference in many graph learning tasks. Inspired by the success of GNNs, we investigate building a higher-order semantic dependency parser by applying GNNs. Instead of explicitly extracting higher-order features from intermediate parsing graphs, GNNs aggregate higher-order information concisely by stacking multiple GNN layers. Experimental results show that our model outperforms the previous state-of-the-art parser on the SemEval 2015 Task 18 English datasets.
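The claim that stacking GNN layers aggregates higher-order information can be illustrated with a small sketch: after k rounds of neighbor aggregation, a node's representation depends on nodes up to k hops away. This is an assumed, simplified mean-aggregation layer on a hypothetical four-node chain graph, not the parser's actual architecture; all names are illustrative.

```python
def gnn_layer(features, neighbors):
    """One round of message passing: each node averages itself with its neighbors."""
    updated = []
    for node, feat in enumerate(features):
        group = [feat] + [features[n] for n in neighbors[node]]
        updated.append([sum(vals) / len(group) for vals in zip(*group)])
    return updated

def stack_layers(features, neighbors, num_layers):
    # Applying the layer repeatedly widens each node's receptive field by one hop per layer.
    for _ in range(num_layers):
        features = gnn_layer(features, neighbors)
    return features

# Hypothetical chain graph 0 - 1 - 2 - 3 with one-hot node features,
# so component j of a node's vector shows how much node j contributed to it.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

one_layer = stack_layers(features, neighbors, 1)
two_layers = stack_layers(features, neighbors, 2)
```

With one layer, node 0 has only seen node 1; with two layers it also carries information from node 2 (a second-order neighbor) but still nothing from node 3, which would require a third layer.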
Banking Trojans and botnets are primary drivers of financially motivated cybercrime. In this paper, we first analyze how an APT-based banking botnet works, step by step, through its whole lifecycle. We then present a multi-stage system that detects malicious banking botnet activities that potentially target organizations. The system leverages a Cyber Data Lake as well as multiple artificial intelligence techniques at different stages. Evaluation results on public datasets show that the deep learning-based detections were highly successful compared with baseline models.