Faizan Ali

Segment-Level Coherence for Robust Harmful Intent Probing in LLMs

Apr 16, 2026

Xuanli He, Bilgehan Sel, Faizan Ali, Jenny Bao, Hoagy Cunningham, Jerry Wei

Abstract: Large Language Models (LLMs) are increasingly exposed to adaptive jailbreaking, particularly in high-stakes Chemical, Biological, Radiological, and Nuclear (CBRN) domains. Although streaming probes enable real-time monitoring, they still make systematic errors. We identify a core issue: existing methods often rely on a few high-scoring tokens, leading to false alarms when sensitive CBRN terms appear in benign contexts. To address this, we introduce a streaming probing objective that requires multiple evidence tokens to consistently support a prediction, rather than relying on isolated spikes. This encourages more robust detection based on aggregated signals instead of single-token cues. At a fixed 1% false-positive rate, our method improves the true-positive rate by 35.55% relative to strong streaming baselines. We further observe substantial gains in AUROC, even when starting from near-saturated baseline performance (AUROC = 97.40%). We also show that probing Attention or MLP activations consistently outperforms residual-stream features. Finally, even when adversarial fine-tuning enables novel character-level ciphers, harmful intent remains detectable: probes developed for the base LLMs can be applied "plug-and-play" to these obfuscated attacks, achieving an AUROC of over 98.85%.
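The abstract does not spell out the probing objective, so the following is only an illustrative sketch of the general idea it describes: scoring a sequence by several consistently high tokens rather than by one spike. The top-k mean aggregation, the function names, the threshold, and the per-token scores are all assumptions made for illustration, not the authors' method or data.

```python
from typing import Sequence


def max_score(token_scores: Sequence[float]) -> float:
    """Sequence score driven entirely by the single highest-scoring token."""
    return max(token_scores)


def topk_mean_score(token_scores: Sequence[float], k: int = 4) -> float:
    """Sequence score as the mean of the k highest token scores.

    Hypothetical aggregation: one isolated spike is diluted by the
    other k-1 low scores, so several tokens must agree to cross a threshold.
    """
    if not token_scores:
        raise ValueError("token_scores must be non-empty")
    top = sorted(token_scores, reverse=True)[:k]
    return sum(top) / len(top)


# Made-up per-token probe scores, for illustration only.
# Benign text mentioning one sensitive term: a single isolated spike.
benign = [0.05, 0.02, 0.95, 0.03, 0.04, 0.02]
# Harmful request: several tokens consistently support the prediction.
harmful = [0.10, 0.80, 0.85, 0.90, 0.75, 0.20]

threshold = 0.7  # assumed decision threshold

for name, scores in [("benign", benign), ("harmful", harmful)]:
    flagged_by_max = max_score(scores) > threshold
    flagged_by_topk = topk_mean_score(scores) > threshold
    print(f"{name}: max flags={flagged_by_max}, top-k mean flags={flagged_by_topk}")
```

In this toy setting the max-based score flags both inputs, while the top-k mean flags only the one where several tokens agree, which mirrors the false-alarm pattern the abstract attributes to single-token cues.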

* preprint

Asia Cup 2025: A Structured T20 Match-Level Dataset and Exploratory Analysis for Cricket Analytics

Dec 17, 2025

Kousar Raza, Faizan Ali

Abstract: This paper presents a structured and comprehensive dataset corresponding to the 2025 Asia Cup T20 cricket tournament, designed to facilitate data-driven research in sports analytics. The dataset comprises records from all 19 matches of the tournament and includes 61 variables covering team scores, wickets, powerplay statistics, boundary counts, toss decisions, venues, and player-specific highlights. To demonstrate its analytical value, we conduct an exploratory data analysis focusing on team performance indicators, boundary distributions, and scoring patterns. The dataset is publicly released through Zenodo under a CC-BY 4.0 license to support reproducibility and further research in cricket analytics, predictive modeling, and strategic decision-making. This work contributes an open, machine-readable benchmark dataset for advancing cricket analytics research.

* Dataset available via Zenodo: https://doi.org/10.5281/zenodo.17228056. Source code and analysis scripts are publicly available at: https://github.com/kousarraza/AsiaCup2025
