Stephen Roberts

Prediction-Oriented Subsampling from Data Streams

Aug 05, 2025

Benedetta Lavinia Mussati, Freddie Bickford Smith, Tom Rainforth, Stephen Roberts

Abstract: Data is often generated in streams, with new observations arriving over time. A key challenge for learning models from data streams is capturing relevant information while keeping computational costs manageable. We explore intelligent data subsampling for offline learning, and argue for an information-theoretic method centred on reducing uncertainty in downstream predictions of interest. Empirically, we demonstrate that this prediction-oriented approach performs better than a previously proposed information-theoretic technique on two widely studied problems. At the same time, we highlight that reliably achieving strong performance in practice requires careful model design.

* Published at CoLLAs 2025

First observations of the seiche that shook the world

Nov 04, 2024

Thomas Monahan, Tianning Tang, Stephen Roberts, Thomas A. A. Adcock

Abstract: On September 16th, 2023, an anomalous 10.88 mHz seismic signal was observed globally, persisting for 9 days. One month later an identical signal appeared, lasting for another week. Several studies have theorized that these signals were produced by seiches which formed after two landslide-generated mega-tsunamis in an East-Greenland fjord. This theory is supported by seismic inversions, and analytical and numerical modeling, but no direct observations have been made -- until now. Using data from the new Surface Water Ocean Topography mission, we present the first observations of this phenomenon. By ruling out other oceanographic processes, we validate the seiche theory of previous authors and independently estimate its initial amplitude at 7.9 m using Bayesian machine learning and seismic data. This study demonstrates the value of satellite altimetry for studying extreme events, while also highlighting the need for specialized methods to address the altimetric data's limitations, namely temporal sparsity. These data and approaches will help in understanding future unseen extremes driven by climate change.

* 19 pages, 9 figures

Deep Learning for Options Trading: An End-To-End Approach

Jul 31, 2024

Wee Ling Tan, Stephen Roberts, Stefan Zohren

Abstract: We introduce a novel approach to options trading strategies using a highly scalable and data-driven machine learning algorithm. In contrast to traditional approaches that often require specifications of underlying market dynamics or assumptions on an option pricing model, our models depart fundamentally from the need for these prerequisites, directly learning non-trivial mappings from market data to optimal trading signals. Backtesting on more than a decade of option contracts for equities listed on the S&P 100, we demonstrate that deep learning models trained according to our end-to-end approach exhibit significant improvements in risk-adjusted performance over existing rules-based trading strategies. We find that incorporating turnover regularization into the models leads to further performance enhancements at prohibitively high levels of transaction costs.

Learning to Learn Financial Networks for Optimising Momentum Strategies

Aug 23, 2023

Xingyue Pu, Stefan Zohren, Stephen Roberts, Xiaowen Dong

Abstract: Network momentum provides a novel type of risk premium, which exploits the interconnections among assets in a financial network to predict future returns. However, the current process of constructing financial networks relies heavily on expensive databases and financial expertise, limiting accessibility for small and academic institutions. Furthermore, the traditional approach treats network construction and portfolio optimisation as separate tasks, potentially hindering optimal portfolio performance. To address these challenges, we propose L2GMOM, an end-to-end machine learning framework that simultaneously learns financial networks and optimises trading signals for network momentum strategies. The L2GMOM model is a neural network with a highly interpretable forward propagation architecture, which is derived from algorithm unrolling. L2GMOM is flexible and can be trained with diverse loss functions for portfolio performance, e.g. the negative Sharpe ratio. Backtesting on 64 continuous futures contracts demonstrates a significant improvement in portfolio profitability and risk control, with a Sharpe ratio of 1.74 across a 20-year period.

* 9 pages
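
The abstract above mentions the negative Sharpe ratio as one possible training loss. As a rough illustration only (not the authors' implementation), the sketch below computes an annualised Sharpe ratio from a list of per-period returns and negates it. The function names, the zero risk-free rate, and the default of 252 periods per year are assumptions made for this example.

```python
import math


def sharpe_ratio(returns, periods_per_year=252):
    """Annualised Sharpe ratio of per-period returns.

    Assumptions (not from the paper): risk-free rate is zero and the
    sample standard deviation (n - 1 denominator) is used.
    """
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two returns")
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        raise ValueError("returns have zero variance")
    return mean / std * math.sqrt(periods_per_year)


def negative_sharpe_loss(returns, periods_per_year=252):
    """Loss to minimise: lower loss means a higher Sharpe ratio."""
    return -sharpe_ratio(returns, periods_per_year)


if __name__ == "__main__":
    daily_returns = [0.01, -0.01, 0.02, 0.0]  # made-up numbers
    print(negative_sharpe_loss(daily_returns))
```

In a learning setting the returns would be produced by the model's trading signals, and a differentiable version of the same formula would be minimised by gradient descent.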

Network Momentum across Asset Classes

Aug 22, 2023

Xingyue Pu, Stephen Roberts, Xiaowen Dong, Stefan Zohren

Abstract: We investigate the concept of network momentum, a novel trading signal derived from momentum spillover across assets. Initially observed within the confines of pairwise economic and fundamental ties, such as the stock-bond connection of the same company and stocks linked through supply-demand chains, momentum spillover implies a propagation of momentum risk premium from one asset to another. The similarity of momentum risk premium, exemplified by co-movement patterns, has been spotted across multiple asset classes including commodities, equities, bonds and currencies. However, studying the network effect of momentum spillover across these classes has been challenging due to a lack of readily available common characteristics or economic ties beyond the company level. In this paper, we explore the interconnections of momentum features across a diverse range of 64 continuous futures contracts spanning these four classes. We utilise a linear and interpretable graph learning model with minimal assumptions to reveal the intricacies of the momentum spillover network. By leveraging the learned networks, we construct a network momentum strategy that exhibits a Sharpe ratio of 1.5 and an annual return of 22%, after volatility scaling, from 2000 to 2022. This paper pioneers the examination of momentum spillover across multiple asset classes using only pricing data, presents a multi-asset investment strategy based on network momentum, and underscores the effectiveness of this strategy through robust empirical analysis.

* 27 pages

The instabilities of large learning rate training: a loss landscape view

Jul 22, 2023

Lawrence Wang, Stephen Roberts

Abstract: Modern neural networks are undeniably successful. Numerous works study how the curvature of loss landscapes can affect the quality of solutions. In this work we study the loss landscape by considering the Hessian matrix during network training with large learning rates, an attractive regime that is (in)famously unstable. We characterise the instabilities of gradient descent, and we observe the striking phenomena of landscape flattening and landscape shift, both of which are intimately connected to the instabilities of training.

* arXiv admin note: text overlap with arXiv:2305.18490

G-TRACER: Expected Sharpness Optimization

Jun 24, 2023

John Williams, Stephen Roberts

Abstract: We propose a new regularization scheme for the optimization of deep learning architectures, G-TRACER ("Geometric TRACE Ratio"), which promotes generalization by seeking flat minima, and has a sound theoretical basis as an approximation to a natural-gradient descent based optimization of a generalized Bayes objective. By augmenting the loss function with a TRACER, curvature-regularized optimizers (e.g. SGD-TRACER and Adam-TRACER) are simple to implement as modifications to existing optimizers and don't require extensive tuning. We show that the method converges to a neighborhood (depending on the regularization strength) of a local minimum of the unregularized objective, and demonstrate competitive performance on a number of benchmark computer vision and NLP datasets, with a particular focus on challenging low signal-to-noise ratio problems.

* 16 pages, 2 figures

Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies

Feb 20, 2023

Wee Ling Tan, Stephen Roberts, Stefan Zohren

Abstract: We introduce Spatio-Temporal Momentum strategies, a class of models that unify both time-series and cross-sectional momentum strategies by trading assets based on their cross-sectional momentum features over time. While both time-series and cross-sectional momentum strategies are designed to systematically capture momentum risk premia, these strategies are regarded as distinct implementations and do not consider the concurrent relationship and predictability between temporal and cross-sectional momentum features of different assets. We model spatio-temporal momentum with neural networks of varying complexities and demonstrate that a simple neural network with only a single fully connected layer learns to simultaneously generate trading signals for all assets in a portfolio by incorporating both their time-series and cross-sectional momentum features. Backtesting on portfolios of 46 actively traded US equities and 12 equity index futures contracts, we demonstrate that the model is able to retain its performance over benchmarks in the presence of high transaction costs of up to 5-10 basis points. In particular, we find that the model, when coupled with least absolute shrinkage and turnover regularization, results in the best performance over various transaction cost scenarios.

A large-scale and PCR-referenced vocal audio dataset for COVID-19

Dec 15, 2022

Jobie Budd, Kieran Baker, Emma Karoune, Harry Coppock, Selina Patel, Ana Tendero Cañadas, Alexander Titcomb, Richard Payne, David Hurley, Sabrina Egglestone (+16 more)

Abstract: The UK COVID-19 Vocal Audio Dataset is designed for the training and evaluation of machine learning models that classify SARS-CoV-2 infection status or associated respiratory symptoms using vocal audio. The UK Health Security Agency recruited voluntary participants through the national Test and Trace programme and the REACT-1 survey in England from March 2021 to March 2022, during dominant transmission of the Alpha and Delta SARS-CoV-2 variants and some Omicron variant sublineages. Audio recordings of volitional coughs, exhalations, and speech were collected in the 'Speak up to help beat coronavirus' digital survey alongside demographic, self-reported symptom and respiratory condition data, and linked to SARS-CoV-2 test results. The UK COVID-19 Vocal Audio Dataset represents the largest collection of SARS-CoV-2 PCR-referenced audio recordings to date. PCR results were linked to 70,794 of 72,999 participants and 24,155 of 25,776 positive cases. Respiratory symptoms were reported by 45.62% of participants. This dataset has additional potential uses for bioacoustics research, with 11.30% of participants reporting asthma, and 27.20% with linked influenza PCR test results.

* 36 pages, 4 figures

Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers

Dec 15, 2022

Harry Coppock, George Nicholson, Ivan Kiskin, Vasiliki Koutra, Kieran Baker, Jobie Budd, Richard Payne, Emma Karoune, David Hurley, Alexander Titcomb (+15 more)

Abstract: Recent work has reported that AI classifiers trained on audio recordings can accurately predict severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection status. Here, we undertake a large-scale study of audio-based deep learning classifiers, as part of the UK government's pandemic response. We collect and analyse a dataset of audio recordings from 67,842 individuals with linked metadata, including reverse transcription polymerase chain reaction (PCR) test outcomes, of whom 23,514 tested positive for SARS-CoV-2. Subjects were recruited via the UK government's National Health Service Test-and-Trace programme and the REal-time Assessment of Community Transmission (REACT) randomised surveillance survey. In an unadjusted analysis of our dataset, AI classifiers predict SARS-CoV-2 infection status with high accuracy (Receiver Operating Characteristic Area Under the Curve (ROC-AUC) 0.846 [0.838, 0.854]), consistent with the findings of previous studies. However, after matching on measured confounders, such as age, gender, and self-reported symptoms, our classifiers' performance is much weaker (ROC-AUC 0.619 [0.594, 0.644]). Upon quantifying the utility of audio-based classifiers in practical settings, we find them to be outperformed by simple predictive scores based on user-reported symptoms.
