Xusheng Xiao

BinCtx: Multi-Modal Representation Learning for Robust Android App Behavior Detection

Oct 16, 2025

Zichen Liu, Shao Yang, Xusheng Xiao

Abstract: Mobile app markets host millions of apps, yet undesired behaviors (e.g., disruptive ads, illegal redirection, payment deception) remain hard to catch because they often do not rely on permission-protected APIs and can be easily camouflaged via UI or metadata edits. We present BinCtx, a learning approach that builds multi-modal representations of an app from (i) a global bytecode-as-image view that captures code-level semantics and family-style patterns, (ii) a contextual view (manifested actions, components, declared permissions, URL/IP constants) indicating how behaviors are triggered, and (iii) a third-party-library usage view summarizing invocation frequencies along inter-component call paths. The three views are embedded and fused to train a context-aware classifier. On real-world malware and benign apps, BinCtx attains a macro F1 of 94.73%, outperforming strong baselines by at least 14.92%. It remains robust under commercial obfuscation (F1 84% post-obfuscation) and is more resistant to adversarial samples than state-of-the-art bytecode-only systems.
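
The "bytecode-as-image" view in the abstract above can be illustrated with a generic byte-to-grayscale layout. This is only a minimal sketch of the general idea, not the BinCtx pipeline: the function name `bytes_to_image`, the fixed `width`, and the zero-padding of the last row are all assumptions made for illustration.

```python
import math


def bytes_to_image(data: bytes, width: int = 16) -> list[list[int]]:
    """Lay raw bytes out row by row as a grayscale image (pixel values 0-255).

    Hypothetical helper: the row width and zero-padding are illustrative choices.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = math.ceil(len(data) / width)
    # Zero-pad so the final row is complete.
    padded = data + bytes(height * width - len(data))
    return [list(padded[r * width:(r + 1) * width]) for r in range(height)]


# Ten arbitrary bytes standing in for a bytecode file.
sample = bytes([0x64, 0x65, 0x78, 0x0A, 0x30, 0x33, 0x35, 0x00, 0xFF, 0x10])
image = bytes_to_image(sample, width=4)
for row in image:
    print(row)
```

Each row of the resulting grid could then be fed to an image model; how the embedding is actually computed and fused with the other two views is not specified here.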

On the Security Risks of Knowledge Graph Reasoning

May 03, 2023

Zhaohan Xi, Tianyu Du, Changjiang Li, Ren Pang, Shouling Ji, Xiapu Luo, Xusheng Xiao, Fenglong Ma, Ting Wang

Abstract: Knowledge graph reasoning (KGR) -- answering complex logical queries over large knowledge graphs -- represents an important artificial intelligence task, entailing a range of applications (e.g., cyber threat hunting). However, despite its surging popularity, the potential security risks of KGR are largely unexplored, which is concerning, given the increasing use of such capability in security-critical domains. This work represents a solid initial step towards bridging the striking gap. We systematize the security threats to KGR according to the adversary's objectives, knowledge, and attack vectors. Further, we present ROAR, a new class of attacks that instantiate a variety of such threats. Through empirical evaluation in representative use cases (e.g., medical decision support, cyber threat hunting, and commonsense reasoning), we demonstrate that ROAR is highly effective at misleading KGR into suggesting pre-defined answers for target queries, yet with negligible impact on non-target ones. Finally, we explore potential countermeasures against ROAR, including filtering of potentially poisoning knowledge and training with adversarially augmented queries, which point to several promising research directions.

* In Proceedings of USENIX Security'23. Code: https://github.com/HarrialX/security-risk-KG-reasoning

A System for Efficiently Hunting for Cyber Threats in Computer Systems Using Threat Intelligence

Jan 17, 2021

Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Haoyuan Liu, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song

Abstract: Log-based cyber threat hunting has emerged as an important solution to counter sophisticated cyber attacks. However, existing approaches require non-trivial efforts of manual query construction and have overlooked the rich external knowledge about threat behaviors provided by open-source Cyber Threat Intelligence (OSCTI). To bridge the gap, we build ThreatRaptor, a system that facilitates cyber threat hunting in computer systems using OSCTI. Built upon mature system auditing frameworks, ThreatRaptor provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, (3) a query synthesis mechanism that automatically synthesizes a TBQL query from the extracted threat behaviors, and (4) an efficient query execution engine to search large volumes of system audit logging data.

* Accepted paper at ICDE 2021 demonstrations track. arXiv admin note: substantial text overlap with arXiv:2010.13637

Enabling Efficient Cyber Threat Hunting With Cyber Threat Intelligence

Oct 26, 2020

Peng Gao, Fei Shao, Xiaoyuan Liu, Xusheng Xiao, Zheng Qin, Fengyuan Xu, Prateek Mittal, Sanjeev R. Kulkarni, Dawn Song

Abstract: Log-based cyber threat hunting has emerged as an important solution to counter sophisticated cyber attacks. However, existing approaches require non-trivial efforts of manual query construction and have overlooked the rich external knowledge about threat behaviors provided by open-source Cyber Threat Intelligence (OSCTI). To bridge the gap, we propose EffHunter, a system that facilitates cyber threat hunting in computer systems using OSCTI. Built upon mature system auditing frameworks, EffHunter provides (1) an unsupervised, light-weight, and accurate NLP pipeline that extracts structured threat behaviors from unstructured OSCTI text, (2) a concise and expressive domain-specific query language, TBQL, to hunt for malicious system activities, (3) a query synthesis mechanism that automatically synthesizes a TBQL query for threat hunting from the extracted threat behaviors, and (4) an efficient query execution engine to search large volumes of audit logging data. Evaluations on a broad set of attack cases demonstrate the accuracy and efficiency of EffHunter in enabling practical threat hunting.

Behavior Query Discovery in System-Generated Temporal Graphs

Nov 19, 2015

Bo Zong, Xusheng Xiao, Zhichun Li, Zhenyu Wu, Zhiyun Qian, Xifeng Yan, Ambuj K. Singh, Guofei Jiang

Abstract: Computer system monitoring generates huge amounts of logs that record the interactions of system entities. Querying such data to better understand system behaviors and identify potential system risks and malicious behaviors is a challenging task for system administrators due to the dynamics and heterogeneity of the data. System monitoring data are essentially heterogeneous temporal graphs with nodes being system entities and edges being their interactions over time. Given the complexity of such graphs, it becomes time-consuming for system administrators to manually formulate useful queries in order to examine abnormal activities, attacks, and vulnerabilities in computer systems. In this work, we investigate how to query temporal graphs and treat query formulation as a discriminative temporal graph pattern mining problem. We introduce TGMiner to mine discriminative patterns from system logs, and these patterns can be taken as templates for building more complex queries. TGMiner leverages temporal information in graphs to prune graph patterns that share similar growth trends without compromising pattern quality. Experimental results on real system data show that TGMiner is 6-32 times faster than baseline methods. The discovered patterns were verified by system experts; they achieved high precision (97%) and recall (91%).

* The full version of the paper "Behavior Query Discovery in System-Generated Temporal Graphs", to appear in VLDB'16
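
The abstract's view of monitoring data as a temporal graph (entities as nodes, timestamped interactions as edges) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not TGMiner: the `Event` record, the `occurs_in_order` helper, and the sample log are all hypothetical, and the check only tests whether a sequence of actions appears in time order rather than mining discriminative patterns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One timestamped edge of the temporal graph (hypothetical schema)."""
    src: str     # acting entity, e.g. a process
    dst: str     # target entity, e.g. a file or socket
    action: str  # interaction type
    time: int    # timestamp in arbitrary units


def occurs_in_order(events, actions):
    """Return True if `actions` appear as a subsequence of the time-sorted events."""
    remaining = iter(sorted(events, key=lambda e: e.time))
    # Each `any` consumes the iterator up to the first match, preserving order.
    return all(any(e.action == a for e in remaining) for a in actions)


log = [
    Event("sshd", "bash", "fork", 1),
    Event("bash", "/etc/passwd", "read", 2),
    Event("bash", "10.0.0.5:443", "connect", 3),
]

print(occurs_in_order(log, ["read", "connect"]))  # read happens before connect
print(occurs_in_order(log, ["connect", "read"]))  # wrong temporal order
```

A real pattern miner would also constrain which entities the edges connect; this sketch deliberately ignores node identity to keep the temporal-ordering idea visible.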
