Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yasin Yilmaz

Detecting Adversarial Data via Provable Adversarial Noise Amplification

May 04, 2026

Furkan Mumcu, Yasin Yilmaz

Abstract:The nonuniform and growing impact of adversarial noise across the layers of deep neural networks has been used in the literature, without a formal mathematical justification, to detect adversarial inputs and improve robustness. In this work, we study this phenomenon in detail and present a formal adversarial noise amplification theorem. We specify a set of sufficient conditions under which the adversarial noise amplification is mathematically guaranteed. Based on theoretical observations, we propose a novel training methodology with a custom spectral loss function and a specific architectural design to enhance the amplification signal for detecting adversarial data. Finally, we introduce a new, lightweight detection mechanism that leverages the enhanced amplification signal and operates entirely at inference time. To validate our approach, we demonstrate the detector's efficacy against both state-of-the-art attacks and a purpose-built adaptive attack, confirming that enhanced amplification can serve as a robust and reliable signal for adversarial defense.

Via

Access Paper or Ask Questions

QVAD: A Question-Centric Agentic Framework for Efficient and Training-Free Video Anomaly Detection

Apr 03, 2026

Lokman Bekit, Hamza Karim, Nghia T Nguyen, Yasin Yilmaz

Abstract:Video Anomaly Detection (VAD) is a fundamental challenge in computer vision, particularly due to the open-set nature of anomalies. While recent training-free approaches utilizing Vision-Language Models (VLMs) have shown promise, they typically rely on massive, resource-intensive foundation models to compensate for the ambiguity of static prompts. We argue that the bottleneck in VAD is not necessarily model capacity, but rather the static nature of inquiry. We propose QVAD, a question-centric agentic framework that treats VLM-LLM interaction as a dynamic dialogue. By iteratively refining queries based on visual context, our LLM agent guides smaller VLMs to produce high-fidelity captions and precise semantic reasoning without parameter updates. This ``prompt-updating" mechanism effectively unlocks the latent capabilities of lightweight models, enabling state-of-the-art performance on UCF-Crime, XD-Violence, and UBNormal using a fraction of the parameters required by competing methods. We further demonstrate exceptional generalizability on the single-scene ComplexVAD dataset. Crucially, QVAD achieves high inference speeds with minimal memory footprints, making advanced VAD capabilities deployable on resource-constrained edge devices.

Via

Access Paper or Ask Questions

Preserving Vertical Structure in 3D-to-2D Projection for Permafrost Thaw Mapping

Mar 17, 2026

Justin McMillen, Robert Van Alphen, Taha Sadeghi Chorsi, Jason Shabaga, Mel Rodgers, Rocco Malservisi, Timothy Dixon, Yasin Yilmaz

Abstract:Forecasting permafrost thaw from aerial lidar requires projecting 3D point cloud features onto 2D prediction grids, yet naive aggregation methods destroy the vertical structure critical in forest environments where ground, understory, and canopy carry distinct information about subsurface conditions. We propose a projection decoder with learned height embeddings that enable height-dependent feature transformations, allowing the network to differentiate ground-level signals from canopy returns. Combined with stratified sampling that ensures all forest strata remain represented, our approach preserves the vertical information critical for predicting subsurface conditions. Our approach pairs this decoder with a Point Transformer V3 encoder to predict dense thaw depth maps from drone-collected lidar over boreal forest in interior Alaska. Experiments demonstrate that z-stratified projection outperforms standard averaging-based methods, particularly in areas with complex vertical vegetation structure. Our method enables scalable, high-resolution monitoring of permafrost degradation from readily deployable UAV platforms.

Via

Access Paper or Ask Questions

Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

Mar 04, 2026

Furkan Mumcu, Yasin Yilmaz

Abstract:As Large Language Models (LLMs) transition into autonomous multi-agent ecosystems, robust minimax training becomes essential yet remains prone to instability when highly non-linear policies induce extreme local curvature in the inner maximization. Standard remedies that enforce global Jacobian bounds are overly conservative, suppressing sensitivity in all directions and inducing a large Price of Robustness. We introduce Adversarially-Aligned Jacobian Regularization (AAJR), a trajectory-aligned approach that controls sensitivity strictly along adversarial ascent directions. We prove that AAJR yields a strictly larger admissible policy class than global constraints under mild conditions, implying a weakly smaller approximation gap and reduced nominal performance degradation. Furthermore, we derive step-size conditions under which AAJR controls effective smoothness along optimization trajectories and ensures inner-loop stability. These results provide a structural theory for agentic robustness that decouples minimax stability from global expressivity restrictions.

Via

Access Paper or Ask Questions

Socially-Weighted Alignment: A Game-Theoretic Framework for Multi-Agent LLM Systems

Feb 16, 2026

Furkan Mumcu, Yasin Yilmaz

Abstract:Deploying large language model (LLM) agents in shared environments introduces a fundamental tension between individual alignment and collective stability: locally rational decisions can impose negative externalities that degrade system-level performance. We propose Socially-Weighted Alignment (SWA), a game-theoretic framework that modifies inference-time decision making by interpolating between an agent's private objective and an estimate of group welfare via a social weight $λ\in[0,1]$. In a shared-resource congestion game with $n$ agents and congestion severity $β$, we show that SWA induces a critical threshold $λ^*=(n-β)/(n-1)$ above which agents no longer have marginal incentive to increase demand under overload, yielding a phase transition from persistent congestion to stable operation near capacity. We further provide an inference-time algorithmic instantiation of SWA that does not require parameter updates or multi-agent reinforcement learning, and use a multi-agent simulation to empirically validate the predicted threshold behavior.

Via

Access Paper or Ask Questions

Agentic AI-Empowered Dynamic Survey Framework

Feb 03, 2026

Furkan Mumcu, Lokman Bekit, Michael J. Jones, Anoop Cherian, Yasin Yilmaz

Abstract:Survey papers play a central role in synthesizing and organizing scientific knowledge, yet they are increasingly strained by the rapid growth of research output. As new work continues to appear after publication, surveys quickly become outdated, contributing to redundancy and fragmentation in the literature. We reframe survey writing as a long-horizon maintenance problem rather than a one-time generation task, treating surveys as living documents that evolve alongside the research they describe. We propose an agentic Dynamic Survey Framework that supports the continuous updating of existing survey papers by incrementally integrating new work while preserving survey structure and minimizing unnecessary disruption. Using a retrospective experimental setup, we demonstrate that the proposed framework effectively identifies and incorporates emerging research while preserving the coherence and structure of existing surveys.

Via

Access Paper or Ask Questions

Universal and Efficient Detection of Adversarial Data through Nonuniform Impact on Network Layers

Jun 25, 2025

Furkan Mumcu, Yasin Yilmaz

Abstract:Deep Neural Networks (DNNs) are notoriously vulnerable to adversarial input designs with limited noise budgets. While numerous successful attacks with subtle modifications to original input have been proposed, defense techniques against these attacks are relatively understudied. Existing defense approaches either focus on improving DNN robustness by negating the effects of perturbations or use a secondary model to detect adversarial data. Although equally important, the attack detection approach, which is studied in this work, provides a more practical defense compared to the robustness approach. We show that the existing detection methods are either ineffective against the state-of-the-art attack techniques or computationally inefficient for real-time processing. We propose a novel universal and efficient method to detect adversarial examples by analyzing the varying degrees of impact of attacks on different DNN layers. {Our method trains a lightweight regression model that predicts deeper-layer features from early-layer features, and uses the prediction error to detect adversarial samples.} Through theoretical arguments and extensive experiments, we demonstrate that our detection method is highly effective, computationally efficient for real-time processing, compatible with any DNN architecture, and applicable across different domains, such as image, video, and audio.

* Transactions on Machine Learning Research, June 2025
* arXiv admin note: substantial text overlap with arXiv:2410.17442

Via

Access Paper or Ask Questions

Reliable Radiologic Skeletal Muscle Area Assessment -- A Biomarker for Cancer Cachexia Diagnosis

Mar 19, 2025

Sabeen Ahmed, Nathan Parker, Margaret Park, Daniel Jeong, Lauren Peres, Evan W. Davis, Jennifer B. Permuth, Erin Siegel, Matthew B. Schabath, Yasin Yilmaz(+1 more)

Abstract:Cancer cachexia is a common metabolic disorder characterized by severe muscle atrophy which is associated with poor prognosis and quality of life. Monitoring skeletal muscle area (SMA) longitudinally through computed tomography (CT) scans, an imaging modality routinely acquired in cancer care, is an effective way to identify and track this condition. However, existing tools often lack full automation and exhibit inconsistent accuracy, limiting their potential for integration into clinical workflows. To address these challenges, we developed SMAART-AI (Skeletal Muscle Assessment-Automated and Reliable Tool-based on AI), an end-to-end automated pipeline powered by deep learning models (nnU-Net 2D) trained on mid-third lumbar level CT images with 5-fold cross-validation, ensuring generalizability and robustness. SMAART-AI incorporates an uncertainty-based mechanism to flag high-error SMA predictions for expert review, enhancing reliability. We combined the SMA, skeletal muscle index, BMI, and clinical data to train a multi-layer perceptron (MLP) model designed to predict cachexia at the time of cancer diagnosis. Tested on the gastroesophageal cancer dataset, SMAART-AI achieved a Dice score of 97.80% +/- 0.93%, with SMA estimated across all four datasets in this study at a median absolute error of 2.48% compared to manual annotations with SliceOmatic. Uncertainty metrics-variance, entropy, and coefficient of variation-strongly correlated with SMA prediction errors (0.83, 0.76, and 0.73 respectively). The MLP model predicts cachexia with 79% precision, providing clinicians with a reliable tool for early diagnosis and intervention. By combining automation, accuracy, and uncertainty awareness, SMAART-AI bridges the gap between research and clinical application, offering a transformative approach to managing cancer cachexia.

* 47 pages, 19 figures, 9 Tables

Via

Access Paper or Ask Questions

Multimodal AI-driven Biomarker for Early Detection of Cancer Cachexia

Mar 09, 2025

Sabeen Ahmed, Nathan Parker, Margaret Park, Evan W. Davis, Jennifer B. Permuth, Matthew B. Schabath, Yasin Yilmaz, Ghulam Rasool

Abstract:Cancer cachexia is a multifactorial syndrome characterized by progressive muscle wasting, metabolic dysfunction, and systemic inflammation, leading to reduced quality of life and increased mortality. Despite extensive research, no single definitive biomarker exists, as cachexia-related indicators such as serum biomarkers, skeletal muscle measurements, and metabolic abnormalities often overlap with other conditions. Existing composite indices, including the Cancer Cachexia Index (CXI), Modified CXI (mCXI), and Cachexia Score (CASCO), integrate multiple biomarkers but lack standardized thresholds, limiting their clinical utility. This study proposes a multimodal AI-based biomarker for early cancer cachexia detection, leveraging open-source large language models (LLMs) and foundation models trained on medical data. The approach integrates heterogeneous patient data, including demographics, disease status, lab reports, radiological imaging (CT scans), and clinical notes, using a machine learning framework that can handle missing data. Unlike previous AI-based models trained on curated datasets, this method utilizes routinely collected clinical data, enhancing real-world applicability. Additionally, the model incorporates confidence estimation, allowing the identification of cases requiring expert review for precise clinical interpretation. Preliminary findings demonstrate that integrating multiple data modalities improves cachexia prediction accuracy at the time of cancer diagnosis. The AI-based biomarker dynamically adapts to patient-specific factors such as age, race, ethnicity, weight, cancer type, and stage, avoiding the limitations of fixed-threshold biomarkers. This multimodal AI biomarker provides a scalable and clinically viable solution for early cancer cachexia detection, facilitating personalized interventions and potentially improving treatment outcomes and patient survival.

* 17 pages, 6 figures, 3 Tables

Via

Access Paper or Ask Questions

ComplexVAD: Detecting Interaction Anomalies in Video

Jan 16, 2025

Furkan Mumcu, Michael J. Jones, Yasin Yilmaz, Anoop Cherian

Abstract:Existing video anomaly detection datasets are inadequate for representing complex anomalies that occur due to the interactions between objects. The absence of complex anomalies in previous video anomaly detection datasets affects research by shifting the focus onto simple anomalies. To address this problem, we introduce a new large-scale dataset: ComplexVAD. In addition, we propose a novel method to detect complex anomalies via modeling the interactions between objects using a scene graph with spatio-temporal attributes. With our proposed method and two other state-of-the-art video anomaly detection methods, we obtain baseline scores on ComplexVAD and demonstrate that our new method outperforms existing works.

* 16 pages, 11 figures, to appear in WACV Workshop ASTAD 2025

Via

Access Paper or Ask Questions