Abstract:As LLM-based agents increasingly browse the web on users' behalf, a natural question arises: can websites passively identify which underlying model powers an agent? Doing so would represent a significant security risk, enabling targeted attacks tailored to known model vulnerabilities. Across 14 frontier LLMs and four web environments spanning information retrieval and shopping tasks, we show that an agent's actions and interaction timings, captured via a passive JavaScript tracker, are sufficient to identify the underlying model with up to 96\% F1. We formalise this attack surface by demonstrating that classifiers trained on agent actions generalise across model sizes and families. We further show that strong classifiers can be trained from few interaction traces and that agent identity can be inferred early within an episode. Injecting randomised timing delays between actions substantially degrades classifier performance, but does not provide robust protection: a classifier retrained on delayed traces largely recovers performance. We release our harness and a labelled corpus of agent traces \href{https://github.com/KabakaWilliam/known_actions}{here}.




Abstract:We develop a means to detect ongoing per-country anomalies in the daily usage metrics of the Tor anonymous communication network, and demonstrate the applicability of this technique to identifying likely periods of internet censorship and related events. The presented approach identifies contiguous anomalous periods, rather than daily spikes or drops, and allows anomalies to be ranked according to deviation from expected behaviour. The developed method is implemented as a running tool, with outputs published daily by mailing list. This list highlights per-country anomalous Tor usage, and produces a daily ranking of countries according to the level of detected anomalous behaviour. This list has been active since August 2016, and is in use by a number of individuals, academics, and NGOs as an early warning system for potential censorship events. We focus on Tor, however the presented approach is more generally applicable to usage data of other services, both individually and in combination. We demonstrate that combining multiple data sources allows more specific identification of likely Tor blocking events. We demonstrate the our approach in comparison to existing anomaly detection tools, and against both known historical internet censorship events and synthetic datasets. Finally, we detail a number of significant recent anomalous events and behaviours identified by our tool.