Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:Gaze Estimation

What is Gaze Estimation? Gaze estimation is the process of predicting where a person is looking based on their eye movements.

Electrode Clustering and Bandpass Analysis of EEG Data for Gaze Estimation

Feb 19, 2023

Ard Kastrati, Martyna Beata Plomecka, Joël Küchler, Nicolas Langer, Roger Wattenhofer

Abstract:In this study, we validate the findings of previously published papers, showing the feasibility of an Electroencephalography (EEG) based gaze estimation. Moreover, we extend previous research by demonstrating that with only a slight drop in model performance, we can significantly reduce the number of electrodes, indicating that a high-density, expensive EEG cap is not necessary for the purposes of EEG-based eye tracking. Using data-driven approaches, we establish which electrode clusters impact gaze estimation and how the different types of EEG data preprocessing affect the models' performance. Finally, we also inspect which recorded frequencies are most important for the defined tasks.

* Gaze Meets Machine Learning Workshop (GMML@NeurIPS), New Orleans, Louisiana, USA, December 2022

Via

Access Paper or Ask Questions

Towards Precision in Appearance-based Gaze Estimation in the Wild

Feb 14, 2023

Murthy L. R. D., Abhishek Mukhopadhyay, Shambhavi Aggarwal, Ketan Anand, Pradipta Biswas

Abstract:Appearance-based gaze estimation systems have shown great progress recently, yet the performance of these techniques depend on the datasets used for training. Most of the existing gaze estimation datasets setup in interactive settings were recorded in laboratory conditions and those recorded in the wild conditions display limited head pose and illumination variations. Further, we observed little attention so far towards precision evaluations of existing gaze estimation approaches. In this work, we present a large gaze estimation dataset, PARKS-Gaze, with wider head pose and illumination variation and with multiple samples for a single Point of Gaze (PoG). The dataset contains 974 minutes of data from 28 participants with a head pose range of 60 degrees in both yaw and pitch directions. Our within-dataset and cross-dataset evaluations and precision evaluations indicate that the proposed dataset is more challenging and enable models to generalize on unseen participants better than the existing in-the-wild datasets. The project page can be accessed here: https://github.com/lrdmurthy/PARKS-Gaze

Via

Access Paper or Ask Questions

Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

May 23, 2023

Efe Bozkir, Süleyman Özdel, Mengdi Wang, Brendan David-John, Hong Gao, Kevin Butler, Eakta Jain, Enkelejda Kasneci

Figure 1 for Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

Figure 2 for Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

Figure 3 for Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

Figure 4 for Eye-tracked Virtual Reality: A Comprehensive Survey on Methods and Privacy Challenges

Abstract:Latest developments in computer hardware, sensor technologies, and artificial intelligence can make virtual reality (VR) and virtual spaces an important part of human everyday life. Eye tracking offers not only a hands-free way of interaction but also the possibility of a deeper understanding of human visual attention and cognitive processes in VR. Despite these possibilities, eye-tracking data also reveal privacy-sensitive attributes of users when it is combined with the information about the presented stimulus. To address these possibilities and potential privacy issues, in this survey, we first cover major works in eye tracking, VR, and privacy areas between the years 2012 and 2022. While eye tracking in the VR part covers the complete pipeline of eye-tracking methodology from pupil detection and gaze estimation to offline use and analyses, as for privacy and security, we focus on eye-based authentication as well as computational methods to preserve the privacy of individuals and their eye-tracking data in VR. Later, taking all into consideration, we draw three main directions for the research community by mainly focusing on privacy challenges. In summary, this survey provides an extensive literature review of the utmost possibilities with eye tracking in VR and the privacy implications of those possibilities.

* This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Via

Access Paper or Ask Questions

Toward Super-Resolution for Appearance-Based Gaze Estimation

Mar 17, 2023

Galen O'Shea, Majid Komeili

Abstract:Gaze tracking is a valuable tool with a broad range of applications in various fields, including medicine, psychology, virtual reality, marketing, and safety. Therefore, it is essential to have gaze tracking software that is cost-efficient and high-performing. Accurately predicting gaze remains a difficult task, particularly in real-world situations where images are affected by motion blur, video compression, and noise. Super-resolution has been shown to improve image quality from a visual perspective. This work examines the usefulness of super-resolution for improving appearance-based gaze tracking. We show that not all SR models preserve the gaze direction. We propose a two-step framework based on SwinIR super-resolution model. The proposed method consistently outperforms the state-of-the-art, particularly in scenarios involving low-resolution or degraded images. Furthermore, we examine the use of super-resolution through the lens of self-supervised learning for gaze prediction. Self-supervised learning aims to learn from unlabelled data to reduce the amount of required labeled data for downstream tasks. We propose a novel architecture called SuperVision by fusing an SR backbone network to a ResNet18 (with some skip connections). The proposed SuperVision method uses 5x less labeled data and yet outperforms, by 15%, the state-of-the-art method of GazeTR which uses 100% of training data.

Via

Access Paper or Ask Questions

Accurate Gaze Estimation using an Active-gaze Morphable Model

Jan 30, 2023

Hao Sun, Nick Pears

Abstract:Rather than regressing gaze direction directly from images, we show that adding a 3D shape model can: i) improve gaze estimation accuracy, ii) perform well with lower resolution inputs and iii) provide a richer understanding of the eye-region and its constituent gaze system. Specifically, we use an `eyes and nose' 3D morphable model (3DMM) to capture the eye-region 3D facial geometry and appearance and we equip this with a geometric vergence model of gaze to give an `active-gaze 3DMM'. We show that our approach achieves state-of-the-art results on the Eyediap dataset and we present an ablation study. Our method can learn with only the ground truth gaze target point and the camera parameters, without access to the ground truth gaze origin points, thus widening the applicability of our approach compared to other methods.

Via

Access Paper or Ask Questions

Interaction-aware Joint Attention Estimation Using People Attributes

Aug 10, 2023

Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita

Abstract:This paper proposes joint attention estimation in a single image. Different from related work in which only the gaze-related attributes of people are independently employed, (I) their locations and actions are also employed as contextual cues for weighting their attributes, and (ii) interactions among all of these attributes are explicitly modeled in our method. For the interaction modeling, we propose a novel Transformer-based attention network to encode joint attention as low-dimensional features. We introduce a specialized MLP head with positional embedding to the Transformer so that it predicts pixelwise confidence of joint attention for generating the confidence heatmap. This pixelwise prediction improves the heatmap accuracy by avoiding the ill-posed problem in which the high-dimensional heatmap is predicted from the low-dimensional features. The estimated joint attention is further improved by being integrated with general image-based attention estimation. Our method outperforms SOTA methods quantitatively in comparative experiments. Code: https://anonymous.4open.science/r/anonymized_codes-ECA4.

* Accepted to ICCV2023

Via

Access Paper or Ask Questions

NeRF-Gaze: A Head-Eye Redirection Parametric Model for Gaze Estimation

Dec 30, 2022

Pengwei Yin, Jiawu Dai, Jingjing Wang, Di Xie, Shiliang Pu

Abstract:Gaze estimation is the fundamental basis for many visual tasks. Yet, the high cost of acquiring gaze datasets with 3D annotations hinders the optimization and application of gaze estimation models. In this work, we propose a novel Head-Eye redirection parametric model based on Neural Radiance Field, which allows dense gaze data generation with view consistency and accurate gaze direction. Moreover, our head-eye redirection parametric model can decouple the face and eyes for separate neural rendering, so it can achieve the purpose of separately controlling the attributes of the face, identity, illumination, and eye gaze direction. Thus diverse 3D-aware gaze datasets could be obtained by manipulating the latent code belonging to different face attributions in an unsupervised manner. Extensive experiments on several benchmarks demonstrate the effectiveness of our method in domain generalization and domain adaptation for gaze estimation tasks.

* 10 pages, 8 figures, submitted to CVPR 2023

Via

Access Paper or Ask Questions

Precise localization of corneal reflections in eye images using deep learning trained on synthetic data

Apr 12, 2023

Sean Anthony Byrne, Marcus Nyström, Virmarie Maquiling, Enkelejda Kasneci, Diederick C. Niehorster

Abstract:We present a deep learning method for accurately localizing the center of a single corneal reflection (CR) in an eye image. Unlike previous approaches, we use a convolutional neural network (CNN) that was trained solely using simulated data. Using only simulated data has the benefit of completely sidestepping the time-consuming process of manual annotation that is required for supervised training on real eye images. To systematically evaluate the accuracy of our method, we first tested it on images with simulated CRs placed on different backgrounds and embedded in varying levels of noise. Second, we tested the method on high-quality videos captured from real eyes. Our method outperformed state-of-the-art algorithmic methods on real eye images with a 35% reduction in terms of spatial precision, and performed on par with state-of-the-art on simulated images in terms of spatial accuracy.We conclude that our method provides a precise method for CR center localization and provides a solution to the data availability problem which is one of the important common roadblocks in the development of deep learning models for gaze estimation. Due to the superior CR center localization and ease of application, our method has the potential to improve the accuracy and precision of CR-based eye trackers

Via

Access Paper or Ask Questions

Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Sep 30, 2023

Taichi Higasa, Keitaro Tanaka, Qi Feng, Shigeo Morishima

Figure 1 for Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Figure 2 for Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Figure 3 for Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Figure 4 for Gaze-Driven Sentence Simplification for Language Learners: Enhancing Comprehension and Readability

Abstract:Language learners should regularly engage in reading challenging materials as part of their study routine. Nevertheless, constantly referring to dictionaries is time-consuming and distracting. This paper presents a novel gaze-driven sentence simplification system designed to enhance reading comprehension while maintaining their focus on the content. Our system incorporates machine learning models tailored to individual learners, combining eye gaze features and linguistic features to assess sentence comprehension. When the system identifies comprehension difficulties, it provides simplified versions by replacing complex vocabulary and grammar with simpler alternatives via GPT-3.5. We conducted an experiment with 19 English learners, collecting data on their eye movements while reading English text. The results demonstrated that our system is capable of accurately estimating sentence-level comprehension. Additionally, we found that GPT-3.5 simplification improved readability in terms of traditional readability metrics and individual word difficulty, paraphrasing across different linguistic levels.

* Accepted by ACM ICMI 2023 workshops (Multimodal, Interactive Interfaces for Education)

Via

Access Paper or Ask Questions

Contrastive Weighted Learning for Near-Infrared Gaze Estimation

Nov 06, 2022

Adam Lee

Abstract:Appearance-based gaze estimation has been very successful with the use of deep learning. Many following works improved domain generalization for gaze estimation. However, even though there has been much progress in domain generalization for gaze estimation, most of the recent work have been focused on cross-dataset performance -- accounting for different distributions in illuminations, head pose, and lighting. Although improving gaze estimation in different distributions of RGB images is important, near-infrared image based gaze estimation is also critical for gaze estimation in dark settings. Also there are inherent limitations relying solely on supervised learning for regression tasks. This paper contributes to solving these problems and proposes GazeCWL, a novel framework for gaze estimation with near-infrared images using contrastive learning. This leverages adversarial attack techniques for data augmentation and a novel contrastive loss function specifically for regression tasks that effectively clusters the features of different samples in the latent space. Our model outperforms previous domain generalization models in infrared image based gaze estimation and outperforms the baseline by 45.6\% while improving the state-of-the-art by 8.6\%, we demonstrate the efficacy of our method.

Via

Access Paper or Ask Questions

Topic:Gaze Estimation

Papers and Code