Jingjing Guo

Breaking Obfuscation: Cluster-Aware Graph with LLM-Aided Recovery for Malicious JavaScript Detection

Jul 30, 2025

Zhihong Liang, Xin Wang, Zhenhuang Hu, Liangliang Song, Lin Chen, Jingjing Guo, Yanbin Wang, Ye Tian

Abstract: With the rapid expansion of web-based applications and cloud services, malicious JavaScript code continues to pose significant threats to user privacy, system integrity, and enterprise security. However, detecting such threats remains challenging due to sophisticated code obfuscation techniques and JavaScript's inherent language characteristics, particularly its nested closure structures and syntactic flexibility. In this work, we propose DeCoda, a hybrid defense framework that combines large language model (LLM)-based deobfuscation with code graph learning: (1) We first construct a sophisticated prompt-learning pipeline with multi-stage refinement, where the LLM progressively reconstructs the original code structure from obfuscated inputs and then generates normalized Abstract Syntax Tree (AST) representations; (2) In JavaScript ASTs, dynamic typing scatters semantically similar nodes while deeply nested functions fracture scope capturing, introducing structural noise and semantic ambiguity. To address these challenges, we then propose to learn hierarchical code graph representations via a Cluster-wise Graph that integrates a graph transformer network, node clustering, and node-to-cluster attention to capture both local node-level semantics and global cluster-induced structural relationships from the AST graph. Experimental results show that our method achieves F1-scores of 94.64% and 97.71% on two benchmark datasets, absolute improvements of 10.74% and 13.85% over state-of-the-art baselines. In false-positive control evaluation at fixed FPR levels (0.0001, 0.001, 0.01), our approach delivers 4.82, 5.91, and 2.53 higher TPR respectively compared to the best-performing baseline. These results highlight the effectiveness of LLM-based deobfuscation and underscore the importance of modeling cluster-level relationships in detecting malicious code.
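
The abstract reports TPR at fixed FPR levels without saying how the metric is computed. The sketch below shows one common way to do it, assuming each sample has a score where higher means more likely malicious. The function name `tpr_at_fpr` and the toy data are illustrative assumptions, not taken from the paper.

```python
def tpr_at_fpr(scores, labels, target_fpr):
    """Return the highest TPR reachable while keeping FPR <= target_fpr.

    scores: higher means more likely malicious (assumed convention).
    labels: 1 = malicious, 0 = benign.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    # Sweep the decision threshold from the highest score downward.
    # Samples with equal scores are admitted together, so ties are never split.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    best_tpr, tp, fp, i = 0.0, 0, 0, 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        if fp / negatives > target_fpr:
            break  # lowering the threshold further only adds false positives
        best_tpr = tp / positives
        i = j
    return best_tpr


# Toy data (hypothetical): three malicious and three benign samples.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 0, 1, 0, 0]
strict_tpr = tpr_at_fpr(toy_scores, toy_labels, 0.0)
```

At very low FPR targets such as 0.0001, the result is only meaningful when the benign set is large enough (on the order of tens of thousands of samples) for a single false positive to stay under the budget.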

MSSC-BiMamba: Multimodal Sleep Stage Classification and Early Diagnosis of Sleep Disorders with Bidirectional Mamba

May 31, 2024

Chao Zhang, Weirong Cui, Jingjing Guo

Abstract: Monitoring sleep states is essential for evaluating sleep quality and diagnosing sleep disorders. Traditional manual staging is time-consuming and prone to subjective bias, often resulting in inconsistent outcomes. Here, we developed an automated model for sleep staging and disorder classification to enhance diagnostic accuracy and efficiency. Considering the characteristics of polysomnography (PSG) multi-lead sleep monitoring, we designed a multimodal sleep state classification model, MSSC-BiMamba, that combines an Efficient Channel Attention (ECA) mechanism with a Bidirectional State Space Model (BSSM). The ECA module allows for weighting data from different sensor channels, thereby amplifying the influence of diverse sensor inputs. Additionally, the implementation of bidirectional Mamba (BiMamba) enables the model to effectively capture the multidimensional features and long-range dependencies of PSG data. The developed model demonstrated impressive performance on sleep stage classification tasks on both the ISRUC-S3 and ISRUC-S1 datasets, which contain data with healthy and unhealthy sleep patterns, respectively. In addition, the model exhibited high accuracy for sleep health prediction when evaluated on a combined dataset consisting of ISRUC and Sleep-EDF. Our model, which can effectively handle diverse sleep conditions, is the first to apply BiMamba to sleep staging with multimodal PSG data, showing substantial gains in computational and memory efficiency over traditional Transformer-style models. This method enhances sleep health management by making monitoring more accessible and extending advanced healthcare through innovative technology.

* 10 pages
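
The abstract above says the ECA module weights data from different sensor channels. As a rough illustration, the sketch below follows the generic ECA recipe (per-channel average pooling, a small 1D convolution across neighbouring channels, then a sigmoid) in plain Python. The function name `eca_reweight`, the fixed kernel values, and the toy signals are assumptions; in a real model the kernel is learned, and the paper's own implementation may differ.

```python
import math


def eca_reweight(channels, kernel):
    """Scale each channel by an attention weight in (0, 1).

    channels: list of equal-purpose signals, one list of floats per sensor channel.
    kernel:   odd-length 1D convolution kernel applied across the channel axis
              (hypothetical fixed values here; learned in practice).
    """
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")

    # Step 1: squeeze each channel to one descriptor by global average pooling.
    pooled = [sum(ch) / len(ch) for ch in channels]

    # Step 2: 1D convolution over neighbouring channel descriptors (zero padding),
    # followed by a sigmoid to get one weight per channel.
    half = len(kernel) // 2
    weights = []
    for c in range(len(pooled)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = c + k - half
            if 0 <= idx < len(pooled):
                acc += w * pooled[idx]
        weights.append(1.0 / (1.0 + math.exp(-acc)))

    # Step 3: rescale every sample in a channel by that channel's weight.
    scaled = [[w * x for x in ch] for w, ch in zip(weights, channels)]
    return weights, scaled


# Toy input (hypothetical): three channels, two samples each.
toy_channels = [[1.0, 1.0], [0.0, 0.0], [-1.0, -1.0]]
toy_weights, toy_scaled = eca_reweight(toy_channels, [0.0, 1.0, 0.0])
```

Because the convolution only mixes a few adjacent channel descriptors, the weighting step adds very few parameters compared with a fully connected attention layer.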

Gated Deeper Models are Effective Factor Learners

May 18, 2023

Jingjing Guo

Abstract: Precisely forecasting the excess returns of an asset (e.g., Tesla stock) is beneficial to all investors. However, the unpredictability of market dynamics, influenced by human behaviors, makes this a challenging task. In prior research, researchers have manually crafted a number of factors as signals to guide their investing process. In contrast, this paper views the problem from a different perspective: we use a deep learning model to combine those human-designed factors to predict the trend of excess returns. To this end, we present a 5-layer deep neural network that generates more meaningful factors in a 2048-dimensional space. Modern network design techniques are utilized to make training more robust and reduce overfitting. Additionally, we propose a gated network that dynamically filters out features learned from noise, resulting in improved performance. We evaluate our model on over 2,000 stocks from the China market, using their records from the most recent three years. The experimental results show that the proposed gated activation layer and the deep neural network can effectively address this problem, and that both components contribute to the superior performance of our model. In summary, the proposed model exhibits promising results and could potentially benefit investors seeking to optimize their investment strategies.

* 7 pages
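
The abstract above describes a gated network that filters out noisy features but gives no formula. One common form of gating, shown here purely as an assumption, multiplies each feature by a sigmoid gate computed from the same input. The function name `gated_layer`, the ReLU feature branch, and all weights are hypothetical and are not taken from the paper.

```python
import math


def gated_layer(x, w_feat, b_feat, w_gate, b_gate):
    """Compute relu(W_f x + b_f) * sigmoid(W_g x + b_g), element by element.

    This is one generic gating scheme, assumed for illustration only.
    """
    def affine(weights, bias):
        return [sum(wi * xi for wi, xi in zip(row, x)) + bi
                for row, bi in zip(weights, bias)]

    features = [max(0.0, v) for v in affine(w_feat, b_feat)]            # candidate factors
    gates = [1.0 / (1.0 + math.exp(-v)) for v in affine(w_gate, b_gate)]  # values in (0, 1)
    # A gate near 0 suppresses its feature; a gate near 1 passes it through.
    return [f * g for f, g in zip(features, gates)]


# Toy example (hypothetical weights): the second gate is driven towards zero,
# so the second feature is almost entirely filtered out.
toy_out = gated_layer(
    x=[1.0, 2.0],
    w_feat=[[1.0, 0.0], [0.0, 1.0]], b_feat=[0.0, 0.0],
    w_gate=[[0.0, 0.0], [0.0, 0.0]], b_gate=[0.0, -50.0],
)
```

In a trained network the gate weights would be learned jointly with the feature weights, so the model itself decides which factors to suppress for a given input.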
