Abstract:Stereo vision between images faces a range of challenges, including occlusions, motion, and camera distortions, across applications in autonomous driving, robotics, and face analysis. Due to parameter sensitivity, further complications arise for stereo matching with sparse features, such as facial landmarks. To overcome this ill-posedness and enable unsupervised sparse matching, we consider line constraints of the camera geometry from an optimal transport (OT) viewpoint. Formulating camera-projected points as (half)lines, we propose the use of the classical epipolar distance as well as a 3D ray distance to quantify matching quality. Employing these distances as a cost function of a (partial) OT problem, we arrive at efficiently solvable assignment problems. Moreover, we extend our approach to unsupervised object matching by formulating it as a hierarchical OT problem. The resulting algorithms allow for efficient feature and object matching, as demonstrated in our numerical experiments. Here, we focus on applications in facial analysis, where we aim to match distinct landmarking conventions.
Abstract:We introduce a general framework for constructing generative models using one-dimensional noising processes. Beyond diffusion processes, we outline examples that demonstrate the flexibility of our approach. Motivated by this, we propose a novel framework in which the 1D processes themselves are learnable, achieved by parameterizing the noise distribution through quantile functions that adapt to the data. Our construction integrates seamlessly with standard objectives, including Flow Matching and consistency models. Learning quantile-based noise naturally captures heavy tails and compact supports when present. Numerical experiments highlight both the flexibility and the effectiveness of our method.




Abstract:The projected gradient descent (PGD) method has shown to be effective in recovering compressed signals described in a data-driven way by a generative model, i.e., a generator which has learned the data distribution. Further reconstruction improvements for such inverse problems can be achieved by conditioning the generator on the measurement. The boundary equilibrium generative adversarial network (BEGAN) implements an equilibrium based loss function and an auto-encoding discriminator to better balance the performance of the generator and the discriminator. In this work we investigate a network-based projected gradient descent (NPGD) algorithm for measurement-conditional generative models to solve the inverse problem much faster than regular PGD. We combine the NPGD with conditional GAN/BEGAN to evaluate their effectiveness in solving compressed sensing type problems. Our experiments on the MNIST and CelebA datasets show that the combination of measurement conditional model with NPGD works well in recovering the compressed signal while achieving similar or in some cases even better performance along with a much faster reconstruction. The achieved reconstruction speed-up in our experiments is up to 140-175.