Carl James Debono

Enhancing Object Detection with Privileged Information: A Model-Agnostic Teacher-Student Approach

Jan 05, 2026

Matthias Bartolo, Dylan Seychell, Gabriel Hili, Matthew Montebello, Carl James Debono, Saviour Formosa, Konstantinos Makantasis

Abstract: This paper investigates the integration of the Learning Using Privileged Information (LUPI) paradigm in object detection to exploit fine-grained, descriptive information available during training but not at inference. We introduce a general, model-agnostic methodology for injecting privileged information (such as bounding box masks, saliency maps, and depth cues) into deep learning-based object detectors through a teacher-student architecture. Experiments are conducted across five state-of-the-art object detection models and multiple public benchmarks, including UAV-based litter detection datasets and Pascal VOC 2012, to assess the impact on accuracy, generalization, and computational efficiency. Our results demonstrate that LUPI-trained students consistently outperform their baseline counterparts, achieving significant boosts in detection accuracy with no increase in inference complexity or model size. Performance improvements are especially marked for medium and large objects, while ablation studies reveal that intermediate weighting of teacher guidance optimally balances learning from privileged and standard inputs. The findings affirm that the LUPI framework provides an effective and practical strategy for advancing object detection systems in both resource-constrained and real-world settings.

* Code available on GitHub: https://github.com/mbar0075/lupi-for-object-detection
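
The abstract's "intermediate weighting of teacher guidance" can be pictured as a convex combination of the student's ordinary detection loss and a term that pulls the student toward the teacher. The sketch below is only an illustration of that idea: the abstract does not give the paper's actual loss, so the names `feature_mse` and `lupi_student_loss`, the use of mean squared error on features, and the weight `alpha` are all assumptions rather than the authors' formulation.

```python
from typing import Sequence


def feature_mse(student: Sequence[float], teacher: Sequence[float]) -> float:
    """Mean squared error between student and teacher feature vectors.

    Hypothetical guidance term; the paper may use a different distance.
    """
    if len(student) != len(teacher) or len(student) == 0:
        raise ValueError("feature vectors must be non-empty and equal in length")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def lupi_student_loss(
    detection_loss: float,
    student_features: Sequence[float],
    teacher_features: Sequence[float],
    alpha: float = 0.5,
) -> float:
    """Blend the standard detection loss with teacher guidance.

    alpha = 0 ignores the teacher entirely; alpha = 1 ignores the labels.
    The blending rule itself is an assumption made for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    guidance = feature_mse(student_features, teacher_features)
    return (1.0 - alpha) * detection_loss + alpha * guidance
```

Because the teacher only appears in the training loss, the student is unchanged at inference, which is consistent with the abstract's claim of no added inference complexity or model size.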

A Deep Learning Framework for Visual Attention Prediction and Analysis of News Interfaces

Mar 21, 2025

Matthew Kenely, Dylan Seychell, Carl James Debono, Chris Porter

Abstract: News outlets' competition for attention in news interfaces has highlighted the need for demographically-aware saliency prediction models. Despite recent advancements in saliency detection applied to user interfaces (UI), existing datasets are limited in size and demographic representation. We present a deep learning framework that enhances the SaRa (Saliency Ranking) model with DeepGaze IIE, improving Salient Object Ranking (SOR) performance by 10.7%. Our framework optimizes three key components: saliency map generation, grid segment scoring, and map normalization. Through a two-fold experiment using eye-tracking (30 participants) and mouse-tracking (375 participants aged 13-70), we analyze attention patterns across demographic groups. Statistical analysis reveals significant age-based variations (p < 0.05, ε² = 0.042), with older users (36-70) engaging more with textual content and younger users (13-35) interacting more with images. Mouse-tracking data closely approximates eye-tracking behavior (sAUC = 0.86) and identifies UI elements that immediately stand out, validating its use in large-scale studies. We conclude that saliency studies should prioritize gathering data from a larger, demographically representative sample and report exact demographic distributions.

* This is a preprint submitted to the 2025 IEEE Conference on Artificial Intelligence (CAI)
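
Two of the components named in the abstract, map normalization and grid segment scoring, can be illustrated with a small sketch. This is not the SaRa implementation: the abstract does not describe how segments are scored, so min-max normalization, mean saliency per cell, and the function names `normalize_map` and `rank_grid_segments` are assumptions chosen for simplicity.

```python
from typing import List, Tuple

SaliencyMap = List[List[float]]


def normalize_map(saliency: SaliencyMap) -> SaliencyMap:
    """Min-max normalize a saliency map to the range [0, 1]."""
    flat = [v for row in saliency for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A constant map carries no ranking information.
        return [[0.0 for _ in row] for row in saliency]
    return [[(v - lo) / (hi - lo) for v in row] for row in saliency]


def rank_grid_segments(
    saliency: SaliencyMap, grid: int = 2
) -> List[Tuple[Tuple[int, int], float]]:
    """Split the map into grid x grid segments and rank them.

    Each segment is scored by its mean normalized saliency (an assumed
    scoring rule); the result is sorted from most to least salient.
    """
    norm = normalize_map(saliency)
    height, width = len(norm), len(norm[0])
    if grid < 1 or height < grid or width < grid:
        raise ValueError("grid must be >= 1 and no larger than the map")
    scores = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            cells = [norm[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            scores.append(((gy, gx), sum(cells) / len(cells)))
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Calling `rank_grid_segments(saliency_map, grid=3)` on a saliency map from any predictor would return the nine segments as `((row, column), score)` pairs, with the most salient segment first.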
