Bernd Pfrommer

Multi-view Tracking, Re-ID, and Social Network Analysis of a Flock of Visually Similar Birds in an Outdoor Aviary

Dec 01, 2022
Shiting Xiao, Yufu Wang, Ammon Perkes, Bernd Pfrommer, Marc Schmidt, Kostas Daniilidis, Marc Badger

The ability to capture detailed interactions among individuals in a social group is foundational to our study of animal behavior and neuroscience. Recent advances in deep learning and computer vision are driving rapid progress in methods that can record the actions and interactions of multiple individuals simultaneously. Many social species, such as birds, however, live deeply embedded in a three-dimensional world. This world introduces additional perceptual challenges not encountered when studying animals that move and interact only on 2D planes: occlusions, orientation-dependent appearance, large variation in apparent size, and poor sensor coverage for 3D reconstruction. Here we introduce a system for studying the behavioral dynamics of a group of songbirds as they move throughout a 3D aviary. We study the complexities that arise when tracking a group of closely interacting animals in three dimensions and introduce a novel dataset for evaluating multi-view trackers. Finally, we analyze captured ethogram data and demonstrate that social context affects the distribution of sequential interactions between birds in the aviary.
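
The abstract does not spell out the tracker's internals, but one building block of any multi-view system like this is associating detections of visually similar individuals across calibrated views. The sketch below is a hypothetical illustration of that step, not the paper's method: it triangulates candidate detection pairs with OpenCV, scores them by total reprojection error, and solves the resulting assignment with the Hungarian algorithm from SciPy. The function name and its arguments are assumptions made for illustration.

```python
import numpy as np
import cv2
from scipy.optimize import linear_sum_assignment

def associate_across_views(P1, P2, dets1, dets2):
    """Match detections of similar-looking birds between two calibrated views.

    Illustrative sketch only (not the paper's tracker).  P1, P2 are 3x4
    float projection matrices; dets1 (N, 2) and dets2 (M, 2) are pixel
    coordinates of detections in each view.
    """
    N, M = len(dets1), len(dets2)
    cost = np.zeros((N, M))
    for i in range(N):
        for j in range(M):
            # triangulate the candidate pair (homogeneous 4-vector)
            X = cv2.triangulatePoints(P1, P2,
                                      dets1[i].reshape(2, 1).astype(float),
                                      dets2[j].reshape(2, 1).astype(float))
            X = (X[:3] / X[3]).ravel()
            # score by summed reprojection error in both views
            err = 0.0
            for P, uv in ((P1, dets1[i]), (P2, dets2[j])):
                proj = P @ np.append(X, 1.0)
                err += np.linalg.norm(proj[:2] / proj[2] - uv)
            cost[i, j] = err
    rows, cols = linear_sum_assignment(cost)   # Hungarian assignment
    return list(zip(rows, cols)), cost
```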

Frequency Cam: Imaging Periodic Signals in Real-Time

Nov 01, 2022
Bernd Pfrommer

Due to their high temporal resolution and large dynamic range, event cameras are uniquely suited for the analysis of time-periodic signals in an image. In this work we present an efficient and fully asynchronous event camera algorithm for detecting the fundamental frequency at which image pixels flicker. The algorithm employs a second-order digital infinite impulse response (IIR) filter to perform an approximate per-pixel brightness reconstruction and is more robust to high-frequency noise than the baseline method we compare to. We further demonstrate that using the falling edge of the signal leads to more accurate period estimates than the rising edge, and that for certain signals interpolating the zero-level crossings can further increase accuracy. Our experiments find that the outstanding capabilities of the camera in detecting frequencies up to 64 kHz for a single pixel do not carry over to full-sensor imaging, as readout bandwidth limitations become a serious obstacle. This suggests that a hardware implementation closer to the sensor will allow for greatly improved frequency imaging. We discuss the important design parameters for full-sensor frequency imaging and present Frequency Cam, an open-source implementation as a ROS node that can run on a single core of a laptop CPU at more than 50 million events per second. It produces results that are qualitatively very similar to those obtained from the closed-source vibration analysis module in Prophesee's Metavision Toolkit. The code for Frequency Cam and a demonstration video can be found at https://github.com/berndpfrommer/frequency_cam
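
As a rough illustration of the approach described above (approximate per-pixel brightness reconstruction with a second-order IIR filter, followed by period estimation from interpolated falling-edge zero crossings), here is a minimal single-pixel sketch in Python. It is not the Frequency Cam implementation, which is an asynchronous C++ ROS node; the function name and the alpha/contrast parameters are assumptions for illustration.

```python
import numpy as np

def pixel_frequency(times, polarities, alpha=0.1, contrast=0.1):
    """Estimate the flicker frequency of one pixel from its event stream.

    `times` are event timestamps in seconds, `polarities` are +1/-1 (or
    1/0) event polarities.  `alpha` and `contrast` are illustrative
    parameters, not the ROS node's.
    """
    # crude brightness reconstruction: each event steps log-brightness
    # up or down by one contrast unit
    brightness = np.cumsum(np.where(polarities > 0, contrast, -contrast))

    # second-order IIR low pass (two cascaded first-order stages)
    y1 = y2 = 0.0
    filtered = np.empty_like(brightness)
    for i, b in enumerate(brightness):
        y1 += alpha * (b - y1)
        y2 += alpha * (y1 - y2)
        filtered[i] = y2

    # detrend, then find falling-edge zero crossings (+ -> -)
    s = filtered - filtered.mean()
    idx = np.where((s[:-1] > 0) & (s[1:] <= 0))[0]
    if len(idx) < 2:
        return None

    # linearly interpolate the crossing time between samples
    t0, t1, s0, s1 = times[idx], times[idx + 1], s[idx], s[idx + 1]
    t_cross = t0 + (t1 - t0) * s0 / (s0 - s1)
    return 1.0 / np.mean(np.diff(t_cross))   # fundamental frequency in Hz
```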

* 13 pages, 16 figures, one table 

TagSLAM: Robust SLAM with Fiducial Markers

Oct 01, 2019
Bernd Pfrommer, Kostas Daniilidis

TagSLAM provides a convenient, flexible, and robust way of performing Simultaneous Localization and Mapping (SLAM) with AprilTag fiducial markers. By leveraging a few simple abstractions (bodies, tags, cameras), TagSLAM provides a front end to the GTSAM factor graph optimizer that makes it possible to rapidly design a range of tag-based experiments: full SLAM, extrinsic camera calibration with non-overlapping views, visual localization for ground truth, loop closure for odometry, pose estimation, etc. We discuss in detail how TagSLAM initializes the factor graph in a robust way, and present loop closure as an application example. TagSLAM is a ROS-based open-source package and can be found at https://berndpfrommer.github.io/tagslam_web.
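
To make the factor-graph idea concrete, here is a small sketch using GTSAM's Python bindings. TagSLAM itself uses tag-projection factors built on its bodies/tags/cameras abstractions; this toy example simplifies tag observations to relative-pose (Between) factors with made-up measurements, so it illustrates the general pattern rather than the package's actual factor design.

```python
import numpy as np
import gtsam

def X(i): return gtsam.symbol('x', i)   # camera poses
def T(i): return gtsam.symbol('t', i)   # tag poses

graph = gtsam.NonlinearFactorGraph()
obs_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1] * 3 + [0.05] * 3))
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([1e-4] * 6))

# anchor the first tag at the origin to fix the gauge freedom
graph.add(gtsam.PriorFactorPose3(T(0), gtsam.Pose3(), prior_noise))

# made-up observations: camera 0 sees tag 0, camera 1 sees tags 0 and 1
graph.add(gtsam.BetweenFactorPose3(
    X(0), T(0), gtsam.Pose3(gtsam.Rot3(), np.array([0.0, 0.0, 1.0])), obs_noise))
graph.add(gtsam.BetweenFactorPose3(
    X(1), T(0), gtsam.Pose3(gtsam.Rot3(), np.array([0.5, 0.0, 1.0])), obs_noise))
graph.add(gtsam.BetweenFactorPose3(
    X(1), T(1), gtsam.Pose3(gtsam.Rot3(), np.array([-0.5, 0.0, 1.0])), obs_noise))

# initialize all unknowns at the identity and optimize
initial = gtsam.Values()
for key in (X(0), X(1), T(0), T(1)):
    initial.insert(key, gtsam.Pose3())

result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
print(result.atPose3(X(1)))   # optimized pose of the second camera
```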

Simultaneous Localization and Layout Model Selection in Manhattan Worlds

Dec 13, 2018
Armon Shariati, Bernd Pfrommer, Camillo J. Taylor

In this paper, we demonstrate how Manhattan structure can be exploited to transform the Simultaneous Localization and Mapping (SLAM) problem, which is typically solved by a nonlinear optimization over feature positions, into a model selection problem solved by a convex optimization over higher-order layout structures, namely walls, floors, and ceilings. Furthermore, we show how our novel formulation leads to an optimization procedure that automatically performs data association and loop closure and ultimately produces the simplest model of the environment that is consistent with the available measurements. We verify our method on real-world datasets collected with various sensing modalities.
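
The paper's actual formulation is not reproduced here, but the flavor of "model selection by convex optimization" can be shown with a toy 1D facility-location relaxation: candidate wall offsets are either "opened" or not, each measurement is softly assigned to an opened wall, and a per-wall penalty drives the solution toward the simplest layout consistent with the data. All names and parameters below are illustrative assumptions (using cvxpy).

```python
import numpy as np
import cvxpy as cp

# noisy 1D "wall point" measurements drawn from two true walls at 0.0 and 3.0
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 0.05, 40), rng.normal(3.0, 0.05, 40)])
c = np.linspace(-1.0, 5.0, 25)            # candidate wall offsets
N, K = len(x), len(c)
cost = np.abs(x[:, None] - c[None, :])    # point-to-candidate distances

z = cp.Variable((N, K), nonneg=True)      # relaxed assignment of points to walls
y = cp.Variable(K, nonneg=True)           # relaxed "wall is used" indicators
lam = 2.0                                 # complexity penalty per wall

constraints = [cp.sum(z, axis=1) == 1, y <= 1]
constraints += [z[:, k] <= y[k] for k in range(K)]   # assign only to open walls
objective = cp.Minimize(cp.sum(cp.multiply(cost, z)) + lam * cp.sum(y))
cp.Problem(objective, constraints).solve()

selected = c[np.asarray(y.value).ravel() > 0.5]
print("selected wall offsets:", selected)   # expect roughly [0.0, 3.0]
```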

Predictive and Semantic Layout Estimation for Robotic Applications in Manhattan Worlds

Nov 19, 2018
Armon Shariati, Bernd Pfrommer, Camillo J. Taylor

This paper describes an approach to automatically extracting floor plans from the kinds of incomplete measurements that could be acquired by an autonomous mobile robot. The approach proceeds by reasoning about extended structural layout surfaces, which are automatically extracted from the available data. The scheme can be run in an online manner to build watertight representations of the environment. The system effectively speculates about room boundaries and free-space regions, which provides useful guidance to subsequent motion-planning systems. Experimental results are presented on multiple datasets.

Real Time Dense Depth Estimation by Fusing Stereo with Sparse Depth Measurements

Sep 20, 2018
Shreyas S. Shivakumar, Kartik Mohta, Bernd Pfrommer, Vijay Kumar, Camillo J. Taylor

We present an approach to depth estimation that fuses information from a stereo pair with sparse range measurements derived from a LIDAR sensor or a range camera. The goal of this work is to exploit the complementary strengths of the two sensor modalities: the accurate but sparse range measurements and the ambiguous but dense stereo information. These two sources are effectively and efficiently fused by combining ideas from anisotropic diffusion and semi-global matching. We evaluate our approach on the KITTI 2015 and Middlebury 2014 datasets, using randomly sampled ground truth range measurements as our sparse depth input. We achieve significant performance improvements with a small fraction of range measurements on both datasets. We also provide qualitative results from our platform using the PMDTec Monstar sensor. Our entire pipeline runs on an NVIDIA TX-2 platform at 5 Hz on 1280x1024 stereo images with 128 disparity levels.
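
The abstract mentions combining ideas from anisotropic diffusion and semi-global matching. The sketch below illustrates only the diffusion half, in a simplified, hypothetical form: sparse depth measurements are propagated into a dense map with edge-aware weights derived from the grayscale image, while measured pixels stay fixed. It is not the paper's pipeline; the function name, parameters, and values are assumptions.

```python
import numpy as np

def diffuse_sparse_depth(image, sparse_depth, mask, iters=200, beta=10.0):
    """Densify sparse depth with edge-aware (anisotropic) diffusion.

    `image` is a float grayscale image in [0, 1], `sparse_depth` holds the
    known range values, and `mask` is True where a measurement exists.
    Toy illustration only.
    """
    # start from the mean of the measurements everywhere else
    depth = np.where(mask, sparse_depth, sparse_depth[mask].mean())

    # diffusion weights fall off across strong intensity edges
    def weight(a, b):
        return np.exp(-beta * np.abs(a - b))

    w_up    = weight(image, np.roll(image,  1, axis=0))
    w_down  = weight(image, np.roll(image, -1, axis=0))
    w_left  = weight(image, np.roll(image,  1, axis=1))
    w_right = weight(image, np.roll(image, -1, axis=1))
    w_sum = w_up + w_down + w_left + w_right

    for _ in range(iters):
        # Jacobi-style weighted average of the four neighbors
        avg = (w_up    * np.roll(depth,  1, axis=0) +
               w_down  * np.roll(depth, -1, axis=0) +
               w_left  * np.roll(depth,  1, axis=1) +
               w_right * np.roll(depth, -1, axis=1)) / w_sum
        depth = np.where(mask, sparse_depth, avg)   # keep measurements fixed
    return depth
```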

* 7 pages, 5 figures, 2 tables 

The Open Vision Computer: An Integrated Sensing and Compute System for Mobile Robots

Sep 20, 2018
Morgan Quigley, Kartik Mohta, Shreyas S. Shivakumar, Michael Watterson, Yash Mulgaonkar, Mikael Arguedas, Ke Sun, Sikang Liu, Bernd Pfrommer, Vijay Kumar, Camillo J. Taylor

In this paper we describe the Open Vision Computer (OVC), which was designed to support high-speed, vision-guided autonomous drone flight. In particular, our aim was to develop a system suitable for relatively small-scale flying platforms where size, weight, power consumption, and computational performance were all important considerations. This manuscript describes the primary features of our OVC system and explains how they are used to support fully autonomous indoor and outdoor exploration and navigation operations on our Falcon 250 quadrotor platform.

* 7 pages, 13 figures, conference 

Experiments in Fast, Autonomous, GPS-Denied Quadrotor Flight

Jun 19, 2018
Kartik Mohta, Ke Sun, Sikang Liu, Michael Watterson, Bernd Pfrommer, James Svacha, Yash Mulgaonkar, Camillo Jose Taylor, Vijay Kumar

High-speed navigation through unknown environments is a challenging problem in robotics. It requires fast computation and tight integration of all the subsystems on the robot so that the latency in the perception-action loop is as small as possible. Aerial robots add a payload-capacity limitation, which restricts the amount of computation that can be carried onboard. This requires efficient algorithms for each component in the navigation system. In this paper, we describe our quadrotor system, which is able to smoothly navigate through mixed indoor and outdoor environments and to fly at speeds of more than 18 m/s. We provide an overview of our system and details about the specific component technologies that enable its high-speed navigation capability. We demonstrate the robustness of our system through high-speed autonomous flights and navigation through a variety of obstacle-rich environments.

* Accepted in ICRA 2018 

The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception

Feb 19, 2018
Alex Zihao Zhu, Dinesh Thakur, Tolga Ozaslan, Bernd Pfrommer, Vijay Kumar, Kostas Daniilidis

Event-based cameras are a new passive sensing modality with a number of benefits over traditional cameras, including extremely low latency, asynchronous data acquisition, high dynamic range, and very low power consumption. There has been a lot of recent interest in algorithms that use the events to perform a variety of 3D perception tasks, such as feature tracking, visual odometry, and stereo depth estimation. However, event cameras currently lack the wealth of labeled data that exists for traditional cameras for both testing and development. In this paper, we present a large dataset captured with a synchronized stereo pair of event-based cameras carried on a handheld rig, flown by a hexacopter, driven on top of a car, and mounted on a motorcycle, in a variety of illumination levels and environments. From each camera, we provide the event stream, grayscale images, and IMU readings. In addition, we utilize a combination of IMU, a rigidly mounted lidar system, indoor and outdoor motion capture, and GPS to provide accurate pose and depth images for each camera at up to 100 Hz. For comparison, we also provide synchronized grayscale images and IMU readings from a frame-based stereo camera system.
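
The dataset's own tooling and file layout are not assumed here, but a common first step when working with any event stream is to accumulate a chunk of events into an image for visualization. The helper below is a generic, hypothetical example of that step, operating on plain NumPy arrays of pixel coordinates and polarities; the function name and argument conventions are made up for illustration.

```python
import numpy as np

def events_to_frame(x, y, p, height, width):
    """Accumulate a chunk of events into a signed event-count image.

    `x`, `y` are pixel coordinates and `p` holds polarities in {-1, +1}
    (or {0, 1}, remapped below).  Positive events add +1 and negative
    events add -1 at their pixel.
    """
    pol = np.where(p > 0, 1, -1).astype(np.int32)
    frame = np.zeros((height, width), dtype=np.int32)
    # scatter-add handles multiple events landing on the same pixel
    np.add.at(frame, (y.astype(int), x.astype(int)), pol)
    return frame
```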

* 8 pages, 7 figures, 2 tables. Website: https://daniilidis-group.github.io/mvsec/. Video: https://www.youtube.com/watch?v=AwRMO5vFgak. Updated website and video in comments, DOI 