Younes Karimi

Automated Detection of Doxing on Twitter

Feb 02, 2022

Younes Karimi, Anna Squicciarini, Shomir Wilson

Abstract: Doxing refers to the practice of disclosing sensitive personal information about a person without their consent. This form of cyberbullying is an unpleasant and sometimes dangerous phenomenon for online social networks. Although prior work exists on automated identification of other types of cyberbullying, a need exists for methods capable of detecting doxing on Twitter specifically. We propose and evaluate a set of approaches for automatically detecting second- and third-party disclosures on Twitter of sensitive private information, a subset of which constitutes doxing. We summarize our findings of common intentions behind doxing episodes and compare nine different approaches for automated detection based on string-matching and one-hot encoded heuristics, as well as word and contextualized string embedding representations of tweets. We identify an approach providing 96.86% accuracy and 97.37% recall using contextualized string embeddings and conclude by discussing the practicality of our proposed methods.

* 24 pages, 1 figure. Accepted in the 25th ACM Conference on Computer-Supported Cooperative Work and Social Computing (ACM CSCW 2022)

A Longitudinal Dataset of Twitter ISIS Users

Feb 02, 2022

Younes Karimi, Anna Squicciarini, Peter K. Forster, Kira M. Leavitt

Abstract: We present a large longitudinal dataset of tweets from two sets of users suspected of being affiliated with ISIS. These sets of users are identified based on a prior study and a campaign aimed at shutting down ISIS Twitter accounts. These users engaged with known ISIS accounts at least once during 2014-2015 and were still active as of 2021. Some of them directly supported ISIS users by retweeting their tweets, while others, who quoted tweets of ISIS, have uncertain connections to the ISIS seed accounts. This study and the dataset represent a unique approach to analyzing ISIS data. Although much research exists on ISIS online activities, few studies have focused on individual accounts. Our approach to validating accounts, together with a framework for differentiating accounts' functionality (e.g., propaganda versus operational planning), offers a foundation for future research. We report descriptive statistics and preliminary analyses of the collected data to provide deeper insight and highlight the significance and practicality of such analyses. We further discuss several potential cross-disciplinary use cases and research directions.

* 10 pages, 7 figures; Submitted to the 16th International Conference on Web and Social Media (AAAI ICWSM-2022)

The Panacea Threat Intelligence and Active Defense Platform

Apr 20, 2020

Adam Dalton, Ehsan Aghaei, Ehab Al-Shaer, Archna Bhatia, Esteban Castillo, Zhuo Cheng, Sreekar Dhaduvai, Qi Duan, Md Mazharul Islam, Younes Karimi(+6 more)

Abstract: We describe Panacea, a system that supports natural language processing (NLP) components for active defenses against social engineering attacks. We deploy a pipeline of human language technology, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. Panacea processes modern message formats through a plug-in architecture to accommodate innovative approaches for message analysis, knowledge representation, and dialogue generation. The novelty of the Panacea system is that it uses NLP for cyber defense and engages the attacker using bots to elicit evidence to attribute to the attacker and to waste the attacker's time and resources.

* Accepted at STOC
