Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Dorian Chan

Velox: Learning Representations of 4D Geometry and Appearance

May 06, 2026

Anagh Malik, Dorian Chan, Xiaoming Zhao, David B. Lindell, Oncel Tuzel, Jen-Hao Rick Chang

Abstract:We introduce a framework for learning latent representations of 4D objects which are descriptive, faithfully capturing object geometry and appearance; compressive, aiding in downstream efficiency; and accessible, requiring minimal input, i.e., an unstructured dynamic point cloud, to construct. Specifically, Velox trains an encoder to compress spatiotemporal color point clouds into a set of dynamic shape tokens. These tokens are supervised using two complementary decoders: a 4D surface decoder, which models the time-varying surface distribution capturing the geometry; and a Gaussian decoder, which maps the tokens to 3D Gaussians, helping learn appearance. To demonstrate the utility of our representation, we evaluate it across three downstream tasks -- video-to-4D generation, 3D tracking, and cloth simulation via image-to-4D generation -- and observe strong performances in all settings.

* CVPR 2026, Project page: https://apple.github.io/ml-velox

Via

Access Paper or Ask Questions

LiTo: Surface Light Field Tokenization

Mar 11, 2026

Jen-Hao Rick Chang, Xiaoming Zhao, Dorian Chan, Oncel Tuzel

Abstract:We propose a 3D latent representation that jointly models object geometry and view-dependent appearance. Most prior works focus on either reconstructing 3D geometry or predicting view-independent diffuse appearance, and thus struggle to capture realistic view-dependent effects. Our approach leverages that RGB-depth images provide samples of a surface light field. By encoding random subsamples of this surface light field into a compact set of latent vectors, our model learns to represent both geometry and appearance within a unified 3D latent space. This representation reproduces view-dependent effects such as specular highlights and Fresnel reflections under complex lighting. We further train a latent flow matching model on this representation to learn its distribution conditioned on a single input image, enabling the generation of 3D objects with appearances consistent with the lighting and materials in the input. Experiments show that our approach achieves higher visual quality and better input fidelity than existing methods.

* ICLR 2026; Project page: https://apple.github.io/ml-lito/

Via

Access Paper or Ask Questions

Efficient Non-Line-of-Sight Imaging from Transient Sinograms

Aug 06, 2020

Mariko Isogawa, Dorian Chan, Ye Yuan, Kris Kitani, Matthew O'Toole

Figure 1 for Efficient Non-Line-of-Sight Imaging from Transient Sinograms

Figure 2 for Efficient Non-Line-of-Sight Imaging from Transient Sinograms

Figure 3 for Efficient Non-Line-of-Sight Imaging from Transient Sinograms

Figure 4 for Efficient Non-Line-of-Sight Imaging from Transient Sinograms

Abstract:Non-line-of-sight (NLOS) imaging techniques use light that diffusely reflects off of visible surfaces (e.g., walls) to see around corners. One approach involves using pulsed lasers and ultrafast sensors to measure the travel time of multiply scattered light. Unlike existing NLOS techniques that generally require densely raster scanning points across the entirety of a relay wall, we explore a more efficient form of NLOS scanning that reduces both acquisition times and computational requirements. We propose a circular and confocal non-line-of-sight (C2NLOS) scan that involves illuminating and imaging a common point, and scanning this point in a circular path along a wall. We observe that (1) these C2NLOS measurements consist of a superposition of sinusoids, which we refer to as a transient sinogram, (2) there exists computationally efficient reconstruction procedures that transform these sinusoidal measurements into 3D positions of hidden scatterers or NLOS images of hidden objects, and (3) despite operating on an order of magnitude fewer measurements than previous approaches, these C2NLOS scans provide sufficient information about the hidden scene to solve these different NLOS imaging tasks. We show results from both simulated and real C2NLOS scans.

* ECCV 2020. Project page: https://marikoisogawa.github.io/project/c2nlos

Via

Access Paper or Ask Questions