Carefully designed prompts are key to inducing desired behavior in Large Language Models (LLMs). As a result, great effort has been dedicated to engineering prompts that guide LLMs toward particular behaviors. In this work, we propose PROPANE, an automatic prompt optimization framework that aims to find a prompt inducing outputs semantically similar to a fixed set of examples, without user intervention. We further demonstrate that PROPANE can be used to (a) improve existing prompts, and (b) discover semantically obfuscated prompts that transfer between models.
Graph neural networks (GNNs) have been used to build multi-layer graph models for a number of cybersecurity applications, from fraud detection to software vulnerability analysis. Unfortunately, like traditional neural networks, GNNs suffer from a lack of transparency: it is challenging to interpret their predictions. Prior work has focused on explaining specific factors of a GNN model. In this work, we design and implement Illuminati, a comprehensive and accurate explanation framework for cybersecurity applications that use GNN models. Given a graph and a pre-trained GNN model, Illuminati identifies the important nodes, edges, and attributes contributing to the prediction while requiring no prior knowledge of the GNN model. We evaluate Illuminati on two cybersecurity applications, i.e., code vulnerability detection and smart contract vulnerability detection. The experiments show that Illuminati achieves more accurate explanations than state-of-the-art methods: 87.6% of the subgraphs identified by Illuminati retain their original prediction, an improvement of 10.3% over the best competing method at 77.3%. Furthermore, Illuminati's explanations are easily understood by domain experts, suggesting significant usefulness for the development of cybersecurity applications.
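The core idea behind such post-hoc, model-agnostic explanations can be illustrated with a toy perturbation sketch: remove each node in turn and rank nodes by how much the graph-level prediction shifts. Everything below (the mean-aggregation stand-in for a pre-trained GNN, the sizes, the scoring) is an illustrative assumption, not Illuminati's actual algorithm:

```python
import numpy as np

# Toy stand-in for a pre-trained GNN: one mean-aggregation layer plus a
# linear readout. Illuminati treats the real model as a black box.
rng = np.random.default_rng(2)
n, d = 6, 3
X = rng.normal(size=(n, d))                  # node attributes
A = (rng.random((n, n)) < 0.4).astype(float)
np.fill_diagonal(A, 1.0)                      # self-loops keep degrees >= 1
w = rng.normal(size=(d,))

def predict(X, A):
    deg = A.sum(1, keepdims=True)
    H = (A @ X) / deg                         # mean neighbor aggregation
    return float(H.mean(0) @ w)               # graph-level score

base = predict(X, A)

# Perturbation-style importance: drop each node, measure prediction shift.
importance = []
for i in range(n):
    keep = [j for j in range(n) if j != i]
    importance.append(abs(base - predict(X[keep], A[np.ix_(keep, keep)])))
top_nodes = np.argsort(importance)[::-1]      # most influential nodes first
```

Nodes whose removal barely moves the score are candidates to prune from the explanation subgraph; the retained subgraph should reproduce the original prediction.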
Recent pre-trained language models (PLMs) have achieved great success on many natural language processing tasks by learning linguistic features and contextualized sentence representations. Since the attributes captured in the stacked layers of PLMs are not clearly identified, straightforward approaches such as embedding the last layer are commonly used to derive sentence representations from PLMs. This paper introduces an attention-based pooling strategy that enables the model to preserve the layer-wise signals captured in each layer and learn digested linguistic features for downstream tasks. A contrastive learning objective adapts the layer-wise attention pooling to both unsupervised and supervised settings. This regularizes the anisotropic space of pre-trained embeddings, making it more uniform. We evaluate our model on standard semantic textual similarity (STS) and semantic search tasks. Our method improves the performance of the contrastively learned BERT_base baseline and its variants.
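The layer-wise pooling idea can be sketched numerically: score each layer's sentence vector with a learnable query, softmax the scores over layers, and take the weighted sum. The shapes, the single query vector, and the scaling are illustrative assumptions, not the paper's exact parameterization:

```python
import numpy as np

# Hypothetical setup: L transformer layers, hidden size d.
rng = np.random.default_rng(0)
L, d = 12, 768
layer_embeddings = rng.normal(size=(L, d))   # one pooled vector per layer

# Learnable query scoring each layer's representation.
q = rng.normal(size=(d,))
scores = layer_embeddings @ q / np.sqrt(d)   # (L,)
weights = np.exp(scores - scores.max())
weights /= weights.sum()                     # softmax over layers

# Attention-pooled sentence representation: weighted sum of layer signals.
sentence_repr = weights @ layer_embeddings   # (d,)
```

In training, `q` (and the encoder) would be updated by the contrastive objective, so the pooling learns which layers carry useful signal rather than defaulting to the last layer.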
This work investigates the problem of multi-agent trajectory prediction. Prior approaches lack the capability to capture fine-grained dependencies among coordinated agents. In this paper, we propose a spatial-temporal trajectory prediction approach that learns the strategy of a team with multiple coordinated agents. In particular, we use a graph-based attention model to learn the dependencies among agents. In addition, instead of using recurrent networks (e.g., VRNN, LSTM), our method uses a Temporal Convolutional Network (TCN) as the sequential model to support a long effective history and provide important properties such as parallelism and stable gradients. We demonstrate the effectiveness of our approach on two sports datasets: basketball and soccer. The results show that, compared to related approaches, our model, which infers the dependencies among players, yields substantially improved performance. Code is available at https://github.com/iHeartGraph/predict
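The two ingredients can be sketched minimally: attention over agents at each timestep to model inter-agent dependencies, followed by a causal dilated convolution over time as a single TCN layer. The sizes, the single attention head, and the depthwise kernel are illustrative assumptions, not the released architecture:

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, d = 5, 8, 4                 # agents, timesteps, feature dim (toy sizes)
x = rng.normal(size=(N, T, d))    # per-agent trajectory features

# --- Graph attention over agents (one head, applied per timestep) ---
Wq, Wk = rng.normal(size=(d, d)), rng.normal(size=(d, d))

def agent_attention(feats):       # feats: (N, d) at one timestep
    q, k = feats @ Wq, feats @ Wk
    logits = q @ k.T / np.sqrt(d)               # (N, N) pairwise dependency
    a = np.exp(logits - logits.max(1, keepdims=True))
    a /= a.sum(1, keepdims=True)                # row-wise softmax
    return a @ feats                            # dependency-aware features

h = np.stack([agent_attention(x[:, t]) for t in range(T)], axis=1)  # (N, T, d)

# --- One causal dilated (depthwise) convolution: a single TCN layer ---
kernel, dilation = rng.normal(size=(2, d)), 2
pad = np.pad(h, ((0, 0), (dilation, 0), (0, 0)))     # left-pad => causal
out = kernel[0] * pad[:, :T] + kernel[1] * pad[:, dilation:]
# out[:, t] = w0 * h[:, t - dilation] + w1 * h[:, t]; stacking such layers
# with growing dilation gives the long effective history of a TCN.
```

Stacked dilated layers double the receptive field per layer, which is what lets the TCN cover a long history while keeping all timesteps computable in parallel.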
Analysis of cyber-relevant data has become an area of increasing focus. As larger percentages of businesses and governments begin to understand the implications of cyberattacks, the impetus for better cybersecurity solutions has increased. Unfortunately, current cybersecurity datasets either offer no ground truth or provide it only with anonymized data. The former leads to a quandary when verifying results, and the latter can remove valuable information. Additionally, most existing datasets are large enough to make them unwieldy during prototype development. In this paper, we present the PicoDomain dataset, a compact, high-fidelity collection of Zeek logs from a realistic intrusion using relevant Tactics, Techniques, and Procedures. While simulated on a small-scale network, this dataset consists of traffic typical of an enterprise network and can be used for rapid validation and iterative development of analytics platforms. We have validated this dataset using traditional statistical analysis and off-the-shelf machine learning techniques.