What is Fraud Detection? Fraud detection is a vital topic that applies to many industries including the financial sectors, banking, government agencies, insurance, and law enforcement, and more. Fraud endeavors have detected a radical rise in recent years, making this topic more critical than ever. Despite struggles on the part of the troubled organizations, hundreds of millions of dollars are lost to fraud each year. Because nearly a few samples confirm fraud in a vast community, locating these can be complex. Data mining and statistics help to predict and immediately distinguish fraud and take immediate action to minimize costs.
Papers and Code
Dec 21, 2024
Abstract:Anomaly detection using a network-based approach is one of the most efficient ways to identify abnormal events such as fraud, security breaches, and system faults in a variety of applied domains. While most of the earlier works address the complex nature of graph-structured data and predefined anomalies, the impact of data attributes and emerging anomalies are often neglected. This paper introduces DCOR, a novel approach on attributed networks that integrates reconstruction-based anomaly detection with Contrastive Learning. Utilizing a Graph Neural Network (GNN) framework, DCOR contrasts the reconstructed adjacency and feature matrices from both the original and augmented graphs to detect subtle anomalies. We employed comprehensive experimental studies on benchmark datasets through standard evaluation measures. The results show that DCOR significantly outperforms state-of-the-art methods. Obtained results demonstrate the efficacy of proposed approach in attributed networks with the potential of uncovering new patterns of anomalies.
* 12 pages, accepted at the Thirteenth International Conference on
Complex Networks and Their Applications
Via

Nov 01, 2024
Abstract:Credit card fraud detection is a critical concern for financial institutions, intensified by the rise of contactless payment technologies. While deep learning models offer high accuracy, their lack of explainability poses significant challenges in financial settings. This paper explores the integration of fuzzy logic into Deep Symbolic Regression (DSR) to enhance both performance and explainability in fraud detection. We investigate the effectiveness of different fuzzy logic implications, specifically {\L}ukasiewicz, G\"odel, and Product, in handling the complexity and uncertainty of fraud detection datasets. Our analysis suggest that the {\L}ukasiewicz implication achieves the highest F1-score and overall accuracy, while the Product implication offers a favorable balance between performance and explainability. Despite having a performance lower than state-of-the-art (SOTA) models due to information loss in data transformation, our approach provides novelty and insights into into integrating fuzzy logic into DSR for fraud detection, providing a comprehensive comparison between different implications and methods.
* 10 pages, 1 figure, published for XAI FIN 24
https://easychair.org/cfp/xaifin2024
Via

Nov 01, 2024
Abstract:This paper introduces a Machine Learning-Driven website Platform and Browser Extension designed to quickly enhance online security by providing real-time risk scoring and fraud detection for website legitimacy verification and consumer protection. The platform works seamlessly in the background to analyze website behavior, network traffic, and user interactions, offering immediate feedback and alerts when potential threats are detected. By integrating this system into a user-friendly browser extension, the platform empowers individuals to navigate the web safely, reducing the risk of engaging with fraudulent websites. Its real-time functionality is crucial in e-commerce and everyday browsing, where quick, actionable insights can prevent financial losses, identity theft, and exposure to malicious sites. This paper explores how this solution offers a practical, fast-acting tool for enhancing online consumer protection, underscoring its potential to play a critical role in safeguarding users and maintaining trust in digital transactions. The platform's focus on speed and efficiency makes it an essential asset for preventing fraud in today's increasingly digital world.
* Journal of Multidisciplinary Engineering Science and Technology
(JMEST) 2024
Via

Dec 11, 2024
Abstract:Face-swapping techniques have advanced rapidly with the evolution of deep learning, leading to widespread use and growing concerns about potential misuse, especially in cases of fraud. While many efforts have focused on detecting swapped face images or videos, these methods are insufficient for tracing the malicious users behind fraudulent activities. Intrusive watermark-based approaches also fail to trace unmarked identities, limiting their practical utility. To address these challenges, we introduce FaceTracer, the first non-intrusive framework specifically designed to trace the identity of the source person from swapped face images or videos. Specifically, FaceTracer leverages a disentanglement module that effectively suppresses identity information related to the target person while isolating the identity features of the source person. This allows us to extract robust identity information that can directly link the swapped face back to the original individual, aiding in uncovering the actors behind fraudulent activities. Extensive experiments demonstrate FaceTracer's effectiveness across various face-swapping techniques, successfully identifying the source person in swapped content and enabling the tracing of malicious actors involved in fraudulent activities. Additionally, FaceTracer shows strong transferability to unseen face-swapping methods including commercial applications and robustness against transmission distortions and adaptive attacks.
* 17 pages, 18 figures, under review
Via

Nov 19, 2024
Abstract:Graph anomaly detection (GAD) is a critical task in graph machine learning, with the primary objective of identifying anomalous nodes that deviate significantly from the majority. This task is widely applied in various real-world scenarios, including fraud detection and social network analysis. However, existing GAD methods still face two major challenges: (1) They are often limited to detecting anomalies in single-type interaction graphs and struggle with multiple interaction types in multiplex heterogeneous graphs; (2) In unsupervised scenarios, selecting appropriate anomaly score thresholds remains a significant challenge for accurate anomaly detection. To address the above challenges, we propose a novel Unsupervised Multiplex Graph Anomaly Detection method, named UMGAD. We first learn multi-relational correlations among nodes in multiplex heterogeneous graphs and capture anomaly information during node attribute and structure reconstruction through graph-masked autoencoder (GMAE). Then, to further weaken the influence of noise and redundant information on abnormal information extraction, we generate attribute-level and subgraph-level augmented-view graphs respectively, and perform attribute and structure reconstruction through GMAE. Finally, We learn to optimize node attributes and structural features through contrastive learning between original-view and augmented-view graphs to improve the model's ability to capture anomalies. Meanwhile, we also propose a new anomaly score threshold selection strategy, which allows the model to be independent of the ground truth in real unsupervised scenarios. Extensive experiments on four datasets show that our \model significantly outperforms state-of-the-art methods, achieving average improvements of 13.48% in AUC and 11.68% in Macro-F1 across all datasets.
Via

Oct 26, 2024
Abstract:As online fraud becomes more sophisticated and pervasive, traditional fraud detection methods are struggling to keep pace with the evolving tactics employed by fraudsters. This paper explores the transformative role of machine learning in addressing these challenges by offering more advanced, scalable, and adaptable solutions for fraud detection and prevention. By analyzing key models such as Random Forest, Neural Networks, and Gradient Boosting, this paper highlights the strengths of machine learning in processing vast datasets, identifying intricate fraud patterns, and providing real-time predictions that enable a proactive approach to fraud prevention. Unlike rule-based systems that react after fraud has occurred, machine learning models continuously learn from new data, adapting to emerging fraud schemes and reducing false positives, which ultimately minimizes financial losses. This research emphasizes the potential of machine learning to revolutionize fraud detection frameworks by making them more dynamic, efficient, and capable of handling the growing complexity of fraud across various industries. Future developments in machine learning, including deep learning and hybrid models, are expected to further enhance the predictive accuracy and applicability of these systems, ensuring that organizations remain resilient in the face of new and emerging fraud tactics.
* World Journal of Advanced Research and Reviews (2024)
Via

Nov 08, 2024
Abstract:Imbalanced learning occurs in classification settings where the distribution of class-labels is highly skewed in the training data, such as when predicting rare diseases or in fraud detection. This class imbalance presents a significant algorithmic challenge, which can be further exacerbated when privacy-preserving techniques such as differential privacy are applied to protect sensitive training data. Our work formalizes these challenges and provides a number of algorithmic solutions. We consider DP variants of pre-processing methods that privately augment the original dataset to reduce the class imbalance; these include oversampling, SMOTE, and private synthetic data generation. We also consider DP variants of in-processing techniques, which adjust the learning algorithm to account for the imbalance; these include model bagging, class-weighted empirical risk minimization and class-weighted deep learning. For each method, we either adapt an existing imbalanced learning technique to the private setting or demonstrate its incompatibility with differential privacy. Finally, we empirically evaluate these privacy-preserving imbalanced learning methods under various data and distributional settings. We find that private synthetic data methods perform well as a data pre-processing step, while class-weighted ERMs are an alternative in higher-dimensional settings where private synthetic data suffers from the curse of dimensionality.
* 14 pages
Via

Nov 05, 2024
Abstract:Knowledge Graphs have emerged as a compelling abstraction for capturing key relationship among the entities of interest to enterprises and for integrating data from heterogeneous sources. JPMorgan Chase (JPMC) is leading this trend by leveraging knowledge graphs across the organization for multiple mission critical applications such as risk assessment, fraud detection, investment advice, etc. A core problem in leveraging a knowledge graph is to link mentions (e.g., company names) that are encountered in textual sources to entities in the knowledge graph. Although several techniques exist for entity linking, they are tuned for entities that exist in Wikipedia, and fail to generalize for the entities that are of interest to an enterprise. In this paper, we propose a novel end-to-end neural entity linking model (JEL) that uses minimal context information and a margin loss to generate entity embeddings, and a Wide & Deep Learning model to match character and semantic information respectively. We show that JEL achieves the state-of-the-art performance to link mentions of company names in financial news with entities in our knowledge graph. We report on our efforts to deploy this model in the company-wide system to generate alerts in response to financial news. The methodology used for JEL is directly applicable and usable by other enterprises who need entity linking solutions for data that are unique to their respective situations.
* 8 pages, 4 figures, IAAI-21
Via

Oct 21, 2024
Abstract:With rapid transformation of technologies, the fusion of Artificial Intelligence (AI) and Machine Learning (ML) in finance is disrupting the entire ecosystem and operations which were followed for decades. The current landscape is where decisions are increasingly data-driven by financial institutions with an appetite for automation while mitigating risks. The segments of financial institutions which are getting heavily influenced are retail banking, wealth management, corporate banking & payment ecosystem. The solution ranges from onboarding the customers all the way fraud detection & prevention to enhancing the customer services. Financial Institutes are leap frogging with integration of Artificial Intelligence and Machine Learning in mainstream applications and enhancing operational efficiency through advanced predictive analytics, extending personalized customer experiences, and automation to minimize risk with fraud detection techniques. However, with Adoption of AI & ML, it is imperative that the financial institute also needs to address ethical and regulatory challenges, by putting in place robust governance frameworks and responsible AI practices.
* Transactions on Engineering and Computing Sciences - Vol. 12, No.
4 Publication Date: August 25, 2024
* 10 pages, 1 figure
Via

Oct 10, 2024
Abstract:The digital revolution has significantly impacted financial transactions, leading to a notable increase in credit card usage. However, this convenience comes with a trade-off: a substantial rise in fraudulent activities. Traditional machine learning methods for fraud detection often struggle to capture the inherent interconnectedness within financial data. This paper proposes a novel approach for credit card fraud detection that leverages Graph Neural Networks (GNNs) with attention mechanisms applied to heterogeneous graph representations of financial data. Unlike homogeneous graphs, heterogeneous graphs capture intricate relationships between various entities in the financial ecosystem, such as cardholders, merchants, and transactions, providing a richer and more comprehensive data representation for fraud analysis. To address the inherent class imbalance in fraud data, where genuine transactions significantly outnumber fraudulent ones, the proposed approach integrates an autoencoder. This autoencoder, trained on genuine transactions, learns a latent representation and flags deviations during reconstruction as potential fraud. This research investigates two key questions: (1) How effectively can a GNN with an attention mechanism detect and prevent credit card fraud when applied to a heterogeneous graph? (2) How does the efficacy of the autoencoder with attention approach compare to traditional methods? The results are promising, demonstrating that the proposed model outperforms benchmark algorithms such as Graph Sage and FI-GRL, achieving a superior AUC-PR of 0.89 and an F1-score of 0.81. This research significantly advances fraud detection systems and the overall security of financial transactions by leveraging GNNs with attention mechanisms and addressing class imbalance through an autoencoder.
Via
