Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

William J. Beksi

Energy-Based Open-Set Active Learning for Object Classification

Apr 22, 2026

Zongyao Lyu, William J. Beksi

Abstract:Active learning (AL) has emerged as a crucial methodology for minimizing labeling costs in deep learning by selecting the most valuable samples from a pool of unlabeled data for annotation. Traditional AL operates under a closed-set assumption, where all classes in the dataset are known and consistent. However, real-world scenarios often present open-set conditions in which unlabeled data contains both known and unknown classes. In such environments, standard AL techniques struggle. They can mistakenly query samples from unknown categories, leading to inefficient use of annotation budgets. In this paper, we propose a novel dual-stage energy-based framework for open-set AL. Our method employs two specialized energy-based models (EBMs). The first, an energy-based known/unknown separator, filters out samples likely to belong to unknown classes. The second, an energy-based sample scorer, assesses the informativeness of the filtered known samples. Using the energy landscape, our models distinguish between data points from known and unknown classes in the unlabeled pool by assigning lower energy to known samples and higher energy to unknown samples, ensuring that only samples from classes of interest are selected for labeling. By integrating these components, our approach ensures efficient and targeted sample selection, maximizing learning impact in each iteration. Experiments on 2D (CIFAR-10, CIFAR-100, TinyImageNet) and 3D (ModelNet40) object classification benchmarks demonstrates that our framework outperforms existing approaches, achieving superior annotation efficiency and classification performance in open-set environments.

* To be published in the 2026 International Conference on Pattern Recognition (ICPR)

Via

Access Paper or Ask Questions

UEOF: A Benchmark Dataset for Underwater Event-Based Optical Flow

Jan 15, 2026

Nick Truong, Pritam P. Karmokar, William J. Beksi

Abstract:Underwater imaging is fundamentally challenging due to wavelength-dependent light attenuation, strong scattering from suspended particles, turbidity-induced blur, and non-uniform illumination. These effects impair standard cameras and make ground-truth motion nearly impossible to obtain. On the other hand, event cameras offer microsecond resolution and high dynamic range. Nonetheless, progress on investigating event cameras for underwater environments has been limited due to the lack of datasets that pair realistic underwater optics with accurate optical flow. To address this problem, we introduce the first synthetic underwater benchmark dataset for event-based optical flow derived from physically-based ray-traced RGBD sequences. Using a modern video-to-event pipeline applied to rendered underwater videos, we produce realistic event data streams with dense ground-truth flow, depth, and camera motion. Moreover, we benchmark state-of-the-art learning-based and model-based optical flow prediction methods to understand how underwater light transport affects event formation and motion estimation accuracy. Our dataset establishes a new baseline for future development and evaluation of underwater event-based perception algorithms. The source code and dataset for this project are publicly available at https://robotic-vision-lab.github.io/ueof.

* To be presented at the 2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) Workshop on Event-Based Vision in the Era of Generative AI

Via

Access Paper or Ask Questions

CropNeRF: A Neural Radiance Field-Based Framework for Crop Counting

Jan 01, 2026

Md Ahmed Al Muzaddid, William J. Beksi

Abstract:Rigorous crop counting is crucial for effective agricultural management and informed intervention strategies. However, in outdoor field environments, partial occlusions combined with inherent ambiguity in distinguishing clustered crops from individual viewpoints poses an immense challenge for image-based segmentation methods. To address these problems, we introduce a novel crop counting framework designed for exact enumeration via 3D instance segmentation. Our approach utilizes 2D images captured from multiple viewpoints and associates independent instance masks for neural radiance field (NeRF) view synthesis. We introduce crop visibility and mask consistency scores, which are incorporated alongside 3D information from a NeRF model. This results in an effective segmentation of crop instances in 3D and highly-accurate crop counts. Furthermore, our method eliminates the dependence on crop-specific parameter tuning. We validate our framework on three agricultural datasets consisting of cotton bolls, apples, and pears, and demonstrate consistent counting performance despite major variations in crop color, shape, and size. A comparative analysis against the state of the art highlights superior performance on crop counting tasks. Lastly, we contribute a cotton plant dataset to advance further research on this topic.

* 8 pages, 10 figures, and 2 tables

Via

Access Paper or Ask Questions

CropTrack: A Tracking with Re-Identification Framework for Precision Agriculture

Dec 31, 2025

Md Ahmed Al Muzaddid, Jordan A. James, William J. Beksi

Abstract:Multiple-object tracking (MOT) in agricultural environments presents major challenges due to repetitive patterns, similar object appearances, sudden illumination changes, and frequent occlusions. Contemporary trackers in this domain rely on the motion of objects rather than appearance for association. Nevertheless, they struggle to maintain object identities when targets undergo frequent and strong occlusions. The high similarity of object appearances makes integrating appearance-based association nontrivial for agricultural scenarios. To solve this problem we propose CropTrack, a novel MOT framework based on the combination of appearance and motion information. CropTrack integrates a reranking-enhanced appearance association, a one-to-many association with appearance-based conflict resolution strategy, and an exponential moving average prototype feature bank to improve appearance-based association. Evaluated on publicly available agricultural MOT datasets, CropTrack demonstrates consistent identity preservation, outperforming traditional motion-based tracking methods. Compared to the state of the art, CropTrack achieves significant gains in identification F1 and association accuracy scores with a lower number of identity switches.

* 8 pages, 5 figures, and 3 tables

Via

Access Paper or Ask Questions

Inertia-Informed Orientation Priors for Event-Based Optical Flow Estimation

Nov 17, 2025

Pritam P. Karmokar, William J. Beksi

Figure 1 for Inertia-Informed Orientation Priors for Event-Based Optical Flow Estimation

Figure 2 for Inertia-Informed Orientation Priors for Event-Based Optical Flow Estimation

Figure 3 for Inertia-Informed Orientation Priors for Event-Based Optical Flow Estimation

Figure 4 for Inertia-Informed Orientation Priors for Event-Based Optical Flow Estimation

Abstract:Event cameras, by virtue of their working principle, directly encode motion within a scene. Many learning-based and model-based methods exist that estimate event-based optical flow, however the temporally dense yet spatially sparse nature of events poses significant challenges. To address these issues, contrast maximization (CM) is a prominent model-based optimization methodology that estimates the motion trajectories of events within an event volume by optimally warping them. Since its introduction, the CM framework has undergone a series of refinements by the computer vision community. Nonetheless, it remains a highly non-convex optimization problem. In this paper, we introduce a novel biologically-inspired hybrid CM method for event-based optical flow estimation that couples visual and inertial motion cues. Concretely, we propose the use of orientation maps, derived from camera 3D velocities, as priors to guide the CM process. The orientation maps provide directional guidance and constrain the space of estimated motion trajectories. We show that this orientation-guided formulation leads to improved robustness and convergence in event-based optical flow estimation. The evaluation of our approach on the MVSEC, DSEC, and ECD datasets yields superior accuracy scores over the state of the art.

* 13 pages, 9 figures, and 3 tables

Via

Access Paper or Ask Questions

Secrets of Edge-Informed Contrast Maximization for Event-Based Vision

Sep 22, 2024

Pritam P. Karmokar, Quan H. Nguyen, William J. Beksi

Abstract:Event cameras capture the motion of intensity gradients (edges) in the image plane in the form of rapid asynchronous events. When accumulated in 2D histograms, these events depict overlays of the edges in motion, consequently obscuring the spatial structure of the generating edges. Contrast maximization (CM) is an optimization framework that can reverse this effect and produce sharp spatial structures that resemble the moving intensity gradients by estimating the motion trajectories of the events. Nonetheless, CM is still an underexplored area of research with avenues for improvement. In this paper, we propose a novel hybrid approach that extends CM from uni-modal (events only) to bi-modal (events and edges). We leverage the underpinning concept that, given a reference time, optimally warped events produce sharp gradients consistent with the moving edge at that time. Specifically, we formalize a correlation-based objective to aid CM and provide key insights into the incorporation of multiscale and multireference techniques. Moreover, our edge-informed CM method yields superior sharpness scores and establishes new state-of-the-art event optical flow benchmarks on the MVSEC, DSEC, and ECD datasets.

* To be published in the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Via

Access Paper or Ask Questions

Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling

Aug 23, 2024

Zongyao Lyu, William J. Beksi

Figure 1 for Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling

Figure 2 for Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling

Figure 3 for Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling

Figure 4 for Semi-Supervised Variational Adversarial Active Learning via Learning to Rank and Agreement-Based Pseudo Labeling

Abstract:Active learning aims to alleviate the amount of labor involved in data labeling by automating the selection of unlabeled samples via an acquisition function. For example, variational adversarial active learning (VAAL) leverages an adversarial network to discriminate unlabeled samples from labeled ones using latent space information. However, VAAL has the following shortcomings: (i) it does not exploit target task information, and (ii) unlabeled data is only used for sample selection rather than model training. To address these limitations, we introduce novel techniques that significantly improve the use of abundant unlabeled data during training and take into account the task information. Concretely, we propose an improved pseudo-labeling algorithm that leverages information from all unlabeled data in a semi-supervised manner, thus allowing a model to explore a richer data space. In addition, we develop a ranking-based loss prediction module that converts predicted relative ranking information into a differentiable ranking loss. This loss can be embedded as a rank variable into the latent space of a variational autoencoder and then trained with a discriminator in an adversarial fashion for sample selection. We demonstrate the superior performance of our approach over the state of the art on various image classification and segmentation benchmark datasets.

* To be published in the 2024 International Conference on Pattern Recognition (ICPR)

Via

Access Paper or Ask Questions

Few-Shot Fruit Segmentation via Transfer Learning

May 04, 2024

Jordan A. James, Heather K. Manching, Amanda M. Hulse-Kemp, William J. Beksi

Abstract:Advancements in machine learning, computer vision, and robotics have paved the way for transformative solutions in various domains, particularly in agriculture. For example, accurate identification and segmentation of fruits from field images plays a crucial role in automating jobs such as harvesting, disease detection, and yield estimation. However, achieving robust and precise infield fruit segmentation remains a challenging task since large amounts of labeled data are required to handle variations in fruit size, shape, color, and occlusion. In this paper, we develop a few-shot semantic segmentation framework for infield fruits using transfer learning. Concretely, our work is aimed at addressing agricultural domains that lack publicly available labeled data. Motivated by similar success in urban scene parsing, we propose specialized pre-training using a public benchmark dataset for fruit transfer learning. By leveraging pre-trained neural networks, accurate semantic segmentation of fruit in the field is achieved with only a few labeled images. Furthermore, we show that models with pre-training learn to distinguish between fruit still on the trees and fruit that have fallen on the ground, and they can effectively transfer the knowledge to the target fruit dataset.

* To be published in the 2024 IEEE International Conference on Robotics and Automation (ICRA)

Via

Access Paper or Ask Questions

A Human-Centered Approach for Bootstrapping Causal Graph Creation

Mar 03, 2024

Minh Q. Tram, Nolan B. Gutierrez, William J. Beksi

Abstract:Causal inference, a cornerstone in disciplines such as economics, genomics, and medicine, is increasingly being recognized as fundamental to advancing the field of robotics. In particular, the ability to reason about cause and effect from observational data is crucial for robust generalization in robotic systems. However, the construction of a causal graphical model, a mechanism for representing causal relations, presents an immense challenge. Currently, a nuanced grasp of causal inference, coupled with an understanding of causal relationships, must be manually programmed into a causal graphical model. To address this difficulty, we present initial results towards a human-centered augmented reality framework for creating causal graphical models. Concretely, our system bootstraps the causal discovery process by involving humans in selecting variables, establishing relationships, performing interventions, generating counterfactual explanations, and evaluating the resulting causal graph at every step. We highlight the potential of our framework via a physical robot manipulator on a pick-and-place task.

* To be presented at the 2024 ACM/IEEE International Conference on Human-Robot Interaction (HRI) Workshop on Causal Learning for Human-Robot Interaction (Causal-HRI)

Via

Access Paper or Ask Questions

NTrack: A Multiple-Object Tracker and Dataset for Infield Cotton Boll Counting

Dec 18, 2023

Md Ahmed Al Muzaddid, William J. Beksi

Abstract:In agriculture, automating the accurate tracking of fruits, vegetables, and fiber is a very tough problem. The issue becomes extremely challenging in dynamic field environments. Yet, this information is critical for making day-to-day agricultural decisions, assisting breeding programs, and much more. To tackle this dilemma, we introduce NTrack, a novel multiple object tracking framework based on the linear relationship between the locations of neighboring tracks. NTrack computes dense optical flow and utilizes particle filtering to guide each tracker. Correspondences between detections and tracks are found through data association via direct observations and indirect cues, which are then combined to obtain an updated observation. Our modular multiple object tracking system is independent of the underlying detection method, thus allowing for the interchangeable use of any off-the-shelf object detector. We show the efficacy of our approach on the task of tracking and counting infield cotton bolls. Experimental results show that our system exceeds contemporary tracking and cotton boll-based counting methods by a large margin. Furthermore, we publicly release the first annotated cotton boll video dataset to the research community.

* To be published in IEEE Transactions on Automation Science and Engineering

Via

Access Paper or Ask Questions