Scene reconstruction in the presence of high-speed motion and low illumination is important in many applications such as augmented and virtual reality, drone navigation, and autonomous robotics. Traditional motion estimation techniques fail in such conditions, suffering from too much blur in the presence of high-speed motion and strong noise in low-light conditions. Single-photon cameras have recently emerged as a promising technology capable of capturing hundreds of thousands of photon frames per second thanks to their high speed and extreme sensitivity. Unfortunately, traditional computer vision techniques are not well suited for dealing with the binary-valued photon data captured by these cameras because these are corrupted by extreme Poisson noise. Here we present a method capable of estimating extreme scene motion under challenging conditions, such as low light or high dynamic range, from a sequence of high-speed image frames such as those captured by a single-photon camera. Our method relies on iteratively improving a motion estimate by grouping and aggregating frames after-the-fact, in a stratified manner. We demonstrate the creation of high-quality panoramas under fast motion and extremely low light, and super-resolution results using a custom single-photon camera prototype. For code and supplemental material see our $\href{https://wisionlab.com/project/panoramas-from-photons/}{\text{project webpage}}$.
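As a minimal sketch of why aggregating binary photon frames helps, the snippet below sums 1-bit frames per pixel and inverts the Bernoulli response under an assumed Poisson arrival model, where a pixel fires with probability $1 - e^{-\phi}$ for a per-frame flux $\phi$. It does not implement the stratified motion estimation described above: the scene is assumed static, and the function name and flux values are hypothetical.

```python
import math
import random

def aggregate_binary_frames(frames):
    """Sum 1-bit frames per pixel and invert the Bernoulli response."""
    n = len(frames)
    estimates = []
    for p in range(len(frames[0])):
        ones = sum(frame[p] for frame in frames)
        # Clamp so a pixel that fired in every frame does not hit log(0).
        frac = min(ones / n, 1.0 - 0.5 / n)
        estimates.append(-math.log(1.0 - frac))
    return estimates

rng = random.Random(0)
true_flux = [0.1, 1.0]  # hypothetical photons per pixel per frame
# Each simulated pixel reads 1 if at least one photon arrived during the frame.
frames = [[1 if rng.random() < 1.0 - math.exp(-f) else 0 for f in true_flux]
          for _ in range(5000)]
flux_hat = aggregate_binary_frames(frames)
```

With motion present, the frames would first have to be aligned using the current motion estimate before being summed, which is the part this sketch leaves out.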
Single-photon cameras (SPCs) have emerged as a promising technology for high-resolution 3D imaging. A single-photon 3D camera determines the round-trip time of a laser pulse by capturing the arrival of individual photons at each camera pixel. Constructing photon-timestamp histograms is a fundamental operation for a single-photon 3D camera. However, in-pixel histogram processing is computationally expensive and requires a large amount of memory per pixel, while digitizing and transferring photon timestamps to an off-sensor histogramming module is bandwidth- and power-hungry. Here we present an online approach for distance estimation without explicitly storing photon counts. The two key ingredients of our approach are (a) processing photon streams using race logic, which maintains photon data in the time-delay domain, and (b) constructing count-free equi-depth histograms. Equi-depth histograms are a succinct representation for ``peaky'' distributions, such as those obtained by an SPC pixel from a laser pulse reflected by a surface. Our approach uses a binner element that converges on the median (or, more generally, on another quantile) of a distribution. We cascade multiple binners to form an equi-depth histogrammer that produces multi-bin histograms. Our evaluation shows that this method can provide an order of magnitude reduction in bandwidth and power consumption while maintaining distance reconstruction accuracy similar to that of conventional processing methods.
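The idea of a binner that converges on a quantile without storing counts can be illustrated in software. The sketch below is an assumption-laden stand-in, not the race-logic circuit from the abstract: it nudges a single running estimate up or down for each photon timestamp, with step sizes chosen so that the moves balance at the requested quantile. The function name, step size, and simulated pulse are hypothetical.

```python
import random

def track_quantile(samples, q=0.5, start=0.0, step=0.05):
    """Count-free quantile tracker that keeps only one running estimate.

    The estimate moves up by step*q when a sample lands above it and down by
    step*(1-q) otherwise, so the expected move is zero where a fraction q of
    the samples fall at or below the estimate.
    """
    est = start
    for t in samples:
        if t > est:
            est += step * q
        else:
            est -= step * (1.0 - q)
    return est

rng = random.Random(0)
# Hypothetical photon timestamps (ns): a laser return centred at 40 ns.
timestamps = [rng.gauss(40.0, 1.0) for _ in range(20000)]
median_hat = track_quantile(timestamps, q=0.5)
```

Running several such trackers at different quantiles would give bin boundaries of an equi-depth histogram. How the binners are actually cascaded in hardware is not specified here and is not modelled.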
Time-resolved image sensors that capture light at pico-to-nanosecond timescales were once limited to niche applications but are now rapidly becoming mainstream in consumer devices. We propose low-cost and low-power imaging modalities that capture scene information from minimal time-resolved image sensors with as few as one pixel. The key idea is to flood-illuminate large scene patches (or the entire scene) with a pulsed light source and measure the time-resolved reflected light by integrating over the entire illuminated area. The one-dimensional measured temporal waveform, called a \emph{transient}, encodes both distances and albedos at all visible scene points and as such is an aggregate proxy for the scene's 3D geometry. We explore the viability and limitations of the transient waveforms by themselves for recovering scene information, and also when combined with traditional RGB cameras. We show that plane estimation can be performed from a single transient and that using only a few more it is possible to recover a depth map of the whole scene. We also show two proof-of-concept hardware prototypes that demonstrate the feasibility of our approach for compact, mobile, and budget-limited applications.
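To make the notion of a transient concrete, the following sketch builds an idealized one-pixel transient from a list of scene points. It rests on simplifying assumptions that are not taken from the abstract: each point contributes its albedo to the time bin matching its round-trip delay, and the pulse shape, radiometric falloff, and noise are ignored. The function name, bin width, and scene values are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def synthesize_transient(distances_m, albedos, bin_width_s=100e-12, n_bins=256):
    """Accumulate per-point albedo into round-trip time bins."""
    hist = [0.0] * n_bins
    for d, a in zip(distances_m, albedos):
        b = int((2.0 * d / C) / bin_width_s)  # round-trip delay -> bin index
        if 0 <= b < n_bins:
            hist[b] += a
    return hist

# Hypothetical scene: three points at 1.5 m and two darker points at 3.0 m.
transient = synthesize_transient([1.5, 1.5, 1.5, 3.0, 3.0],
                                 [0.5, 0.5, 0.5, 0.2, 0.2])
peak_bins = [i for i, v in enumerate(transient) if v > 0]
```

Because all points share one histogram, spatial layout is lost and only the distribution of distances weighted by albedo survives. This is why the transient acts as an aggregate proxy for geometry rather than as a depth map.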
Digital camera pixels measure image intensities by converting incident light energy into an analog electrical current, and then digitizing it into a fixed-width binary representation. This direct measurement method, while conceptually simple, suffers from limited dynamic range and poor performance under extreme illumination -- electronic noise dominates under low illumination, and pixel full-well capacity results in saturation under bright illumination. We propose a novel intensity cue based on measuring inter-photon timing, defined as the time delay between detection of successive photons. Based on the statistics of inter-photon times measured by a time-resolved single-photon sensor, we develop theory and algorithms for a scene brightness estimator which works over extreme dynamic range; we experimentally demonstrate imaging scenes with a dynamic range of over ten million to one. The proposed techniques, aided by the emergence of single-photon sensors such as single-photon avalanche diodes (SPADs) with picosecond timing resolution, will have implications for a wide range of imaging applications: robotics, consumer photography, astronomy, microscopy and biomedical imaging.
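As a brief worked illustration of the inter-photon timing cue, assume an ideal sensor with no dead time and unit quantum efficiency, observing a constant photon flux $\Phi$ (these idealizations are assumptions, not details taken from the abstract). Photon arrivals then form a Poisson process, the inter-photon times $\Delta t_1, \dots, \Delta t_n$ are independent and exponentially distributed with mean $1/\Phi$, and the maximum-likelihood brightness estimate is the reciprocal of their average:

$$
p(\Delta t) = \Phi \, e^{-\Phi \Delta t}, \qquad
\mathbb{E}[\Delta t] = \frac{1}{\Phi}, \qquad
\hat{\Phi} = \frac{n}{\sum_{i=1}^{n} \Delta t_i}.
$$

Under this idealized model the estimate has no upper limit set by a full-well capacity, since a brighter scene only shortens the measured intervals.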
Single-photon avalanche diodes (SPADs) are becoming popular in time-of-flight depth-ranging due to their unique ability to capture individual photons with picosecond timing resolution. However, ambient light (e.g., sunlight) incident on a SPAD-based 3D camera leads to severe non-linear distortions (pileup) in the measured waveform, resulting in large depth errors. We propose asynchronous single-photon 3D imaging, a family of acquisition schemes to mitigate pileup during data acquisition itself. Asynchronous acquisition temporally misaligns SPAD measurement windows and the laser cycles through deterministically predefined or randomized offsets. Our key insight is that pileup distortions can be "averaged out" by choosing a sequence of offsets that span the entire depth range. We develop a generalized image formation model and perform theoretical analysis to explore the space of asynchronous acquisition schemes and design high-performance schemes. Our simulations and experiments demonstrate an improvement in depth accuracy of up to an order of magnitude as compared to the state-of-the-art, across a wide range of imaging scenarios, including those with high ambient flux.
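The averaging effect of asynchronous offsets can be sketched with a simplified pileup model. The assumptions here are not taken from the abstract: the SPAD records only the first photon in each measurement window, the window is exactly one laser period long, and offsets shift the window start circularly within the laser cycle. The function names, rates, and bin count are hypothetical.

```python
import math

def first_photon_probs(rates):
    """P(first detected photon lands in bin i) within one SPAD window."""
    probs, survive = [], 1.0
    for r in rates:
        probs.append(survive * (1.0 - math.exp(-r)))
        survive *= math.exp(-r)  # no photon detected so far
    return probs

def expected_histogram(rates, offsets):
    """Expected histogram in laser-cycle bins, summed over window offsets."""
    n = len(rates)
    hist = [0.0] * n
    for s in offsets:
        shifted = rates[s:] + rates[:s]  # SPAD window starts at laser bin s
        for k, p in enumerate(first_photon_probs(shifted)):
            hist[(s + k) % n] += p
    return hist

n_bins, true_bin = 100, 80
rates = [0.05] * n_bins   # strong ambient flux per bin
rates[true_bin] += 0.5    # laser return at the true depth bin

sync_hist = expected_histogram(rates, offsets=[0])
async_hist = expected_histogram(rates, offsets=range(n_bins))
sync_peak = max(range(n_bins), key=lambda i: sync_hist[i])
async_peak = max(range(n_bins), key=lambda i: async_hist[i])
```

In this toy model the synchronous histogram is dominated by early ambient photons, whereas sweeping the offset across the full cycle gives every bin the same average attenuation, so the laser return stands out again.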
Single-photon avalanche diodes (SPADs) are starting to play a pivotal role in the development of photon-efficient, long-range LiDAR systems. However, due to non-linearities in their image formation model, a high photon flux (e.g., due to strong sunlight) leads to distortion of the incident temporal waveform and, potentially, large depth errors. Operating SPADs in low-flux regimes can mitigate these distortions, but often requires attenuating the signal and thus results in a low signal-to-noise ratio. In this paper, we address the following basic question: at what photon flux should a SPAD-based LiDAR be operated? We derive a closed-form expression for the optimal flux, which is quasi-depth-invariant and depends on the ambient light strength. The optimal flux is lower than what a SPAD typically measures in real-world scenarios but, surprisingly, considerably higher than what is conventionally suggested for avoiding distortions. We propose a simple, adaptive approach for achieving the optimal flux by attenuating incident flux based on an estimate of ambient light strength. Using extensive simulations and a hardware prototype, we show that the optimal flux criterion holds for several depth estimators under a wide range of illumination conditions.
Single-photon avalanche diodes (SPADs) are an emerging technology with a unique capability of capturing individual photons with high timing precision. SPADs are being used in several active imaging systems (e.g., fluorescence lifetime microscopy and LiDAR), albeit mostly limited to low photon flux settings. We propose passive free-running SPAD (PF-SPAD) imaging, an imaging modality that uses SPADs for capturing 2D intensity images with unprecedented dynamic range under ambient lighting, without any active light source. Our key observation is that the precise inter-photon timing measured by a SPAD can be used for estimating scene brightness under ambient lighting conditions, even for very bright scenes. We develop a theoretical model for PF-SPAD imaging, and derive a scene brightness estimator based on the average time of darkness between successive photons detected by a PF-SPAD pixel. Our key insight is that due to the stochastic nature of photon arrivals, this estimator does not suffer from a hard saturation limit. Coupled with high sensitivity at low flux, this enables a PF-SPAD pixel to measure a wide range of scene brightness, from very low to very high, thereby achieving extreme dynamic range. We demonstrate an improvement of over 2 orders of magnitude over conventional sensors by imaging scenes spanning a dynamic range of 10^6:1.
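A small simulation can illustrate an estimator based on the average time of darkness. The sketch assumes a free-running pixel with a fixed dead time after each detection, unit quantum efficiency, and constant flux. The estimator form used, detections divided by total non-dead time, is one plausible reading of the description above rather than a formula quoted from the paper. All names and parameter values are hypothetical.

```python
import random

def simulate_detections(flux, exposure, dead_time, rng):
    """Count detections of a free-running SPAD pixel under constant flux."""
    t, count = 0.0, 0
    while True:
        t += rng.expovariate(flux)  # darkness until the next photon arrives
        if t > exposure:
            return count
        count += 1
        t += dead_time              # pixel is blind after each detection

def estimate_flux(count, exposure, dead_time):
    """Reciprocal of the mean darkness time: count / (time spent active)."""
    return count / (exposure - count * dead_time)

rng = random.Random(0)
flux, exposure, dead_time = 1e7, 1e-2, 150e-9  # photons/s, s, s
count = simulate_detections(flux, exposure, dead_time, rng)
naive = count / exposure                       # plain count rate
flux_hat = estimate_flux(count, exposure, dead_time)
```

In this setting the plain count rate falls well below the true flux because the pixel spends much of the exposure in dead time, while the darkness-based estimate corrects for that and keeps responding as the flux grows.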