Abstract: Deep generative models have been studied and developed primarily in the context of natural images and computer vision. This has spurred the development of (Bayesian) methods that use these generative models for inverse problems in image restoration, such as denoising, inpainting, and super-resolution. In recent years, generative modeling for Bayesian inference on sensory data has also gained traction. Nevertheless, directly applying generative modeling techniques initially designed for natural images to raw sensory data is not straightforward. It requires solutions that deal with high-dynamic-range signals acquired from multiple sensors, or arrays of sensors, that interfere with each other and that typically acquire data at a very high rate. Moreover, the exact physical data-generating process is often complex or unknown. As a consequence, approximate models are used, resulting in non-Gaussian discrepancies between model predictions and observations, which in turn complicate the Bayesian inverse problem. Finally, sensor data is often used in real-time processing or decision-making systems, imposing stringent requirements on, e.g., latency and throughput. In this paper, we discuss some of these challenges and offer approaches to address them, all in the context of high-rate real-time sensing applications in automotive radar and medical imaging.
Abstract: Diffusion models have quickly risen in popularity for their ability to model complex distributions and perform effective posterior sampling. Unfortunately, the iterative nature of these generative models makes them computationally expensive and unsuitable for real-time sequential inverse problems such as ultrasound imaging. Considering the strong temporal structure across sequences of frames, we propose a novel approach that models the transition dynamics to improve the efficiency of sequential diffusion posterior sampling in conditional image synthesis. By modeling sequence data with a video vision transformer (ViViT) transition model based on previous diffusion outputs, we can initialize the reverse diffusion trajectory at a lower noise scale, greatly reducing the number of iterations required for convergence. We demonstrate the effectiveness of our approach on a real-world dataset of high-frame-rate cardiac ultrasound images and show that it achieves the same performance as a full diffusion trajectory while accelerating inference by 25×, enabling real-time posterior sampling. Furthermore, we show that the addition of a transition model improves the PSNR by up to 8% in cases with severe motion. Our method opens up new possibilities for real-time applications of diffusion models in imaging and other domains requiring real-time inference.
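The warm-start idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the linear beta schedule, the step counts (500 and 20), the deterministic DDIM-style update, and the names `alpha_bar_schedule`, `warm_start`, and `ddim_reverse` are all assumptions made for illustration. The list `x_pred` stands in for the output of a transition model, the `denoiser` callable stands in for a trained network, and measurement conditioning is omitted entirely.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_min=1e-4, beta_max=0.02):
    """Cumulative signal-retention factors for an (assumed) linear beta schedule."""
    alpha_bars, running = [], 1.0
    for i in range(num_steps):
        beta = beta_min + (beta_max - beta_min) * i / max(num_steps - 1, 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def warm_start(x_pred, tau, alpha_bars, rng):
    """Forward-diffuse a predicted frame to intermediate step tau (1-indexed)."""
    a = alpha_bars[tau - 1]
    return [math.sqrt(a) * v + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for v in x_pred]


def ddim_reverse(x, start_step, alpha_bars, denoiser):
    """Deterministic DDIM-style reverse pass from start_step down to step 0.

    Returns the final sample and the number of denoiser evaluations.
    """
    calls = 0
    for t in range(start_step, 0, -1):
        a_t = alpha_bars[t - 1]
        a_prev = alpha_bars[t - 2] if t > 1 else 1.0
        x0_hat = denoiser(x, t)  # stand-in for a trained network
        calls += 1
        eps_hat = [(xi - math.sqrt(a_t) * x0i) / math.sqrt(1.0 - a_t)
                   for xi, x0i in zip(x, x0_hat)]
        x = [math.sqrt(a_prev) * x0i + math.sqrt(1.0 - a_prev) * ei
             for x0i, ei in zip(x0_hat, eps_hat)]
    return x, calls


# Hypothetical usage: a 3-pixel "frame" and a placeholder denoiser.
rng = random.Random(0)
alpha_bars = alpha_bar_schedule(500)
x_pred = [0.2, -0.4, 0.9]              # pretend transition-model prediction
x_tau = warm_start(x_pred, 20, alpha_bars, rng)
sample, n_calls = ddim_reverse(x_tau, 20, alpha_bars, lambda x, t: x_pred)
```

In this sketch the cost of sampling is proportional to the starting step, so starting at step 20 instead of step 500 uses 25 times fewer denoiser evaluations. The values are chosen only to make the arithmetic visible and are not taken from the paper.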
Abstract: Ultrasound images formed by delay-and-sum beamforming are plagued by artifacts that only clear up after compounding many transmissions. Some prior works pose imaging as an inverse problem. This approach can yield high image quality with few transmits, but it requires a very fine image grid and is not robust to changes in measurement model parameters. We present INverse grid-Free Estimation of Reflectivities (INFER), an off-grid and stochastic algorithm that solves the inverse scattering problem in ultrasound imaging. Our method jointly optimizes the locations of the gridpoints, their reflectivities, and measurement model parameters such as the speed of sound. This approach allows us to use significantly fewer gridpoints while obtaining better contrast and resolution and being more robust to changes in the imaging target and the hardware. The use of stochastic optimization enables solving for multiple transmissions simultaneously without increasing the required memory or computational load per iteration. We show that our method works across different imaging targets and transmit schemes and compares favorably with other beamforming and inverse solvers. The source code and the dataset to reproduce the results in this paper are available at www.github.com/vincentvdschaft/off-grid-ultrasound.
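To make the "off-grid" parameterization more concrete, the following toy sketch treats a scatterer position and the speed of sound as continuous free parameters of a time-of-flight model and measures how well they explain a set of observed delays. It is not INFER itself, which is available in the linked repository. The geometry, the single-scatterer setup, and the function names `two_way_delays` and `delay_misfit` are assumptions for illustration only.

```python
import math


def two_way_delays(scatterer, tx_origin, elements, sound_speed):
    """Travel time tx_origin -> scatterer -> each receive element, in seconds.

    Positions are (x, z) tuples in metres; sound_speed is in m/s.
    """
    sx, sz = scatterer
    d_tx = math.hypot(sx - tx_origin[0], sz - tx_origin[1])
    return [(d_tx + math.hypot(sx - ex, sz - ez)) / sound_speed
            for ex, ez in elements]


def delay_misfit(observed, scatterer, tx_origin, elements, sound_speed):
    """Sum of squared differences between observed and predicted delays."""
    predicted = two_way_delays(scatterer, tx_origin, elements, sound_speed)
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical 3-element array on the line z = 0 and one scatterer at 3 cm depth.
elements = [(-0.01, 0.0), (0.0, 0.0), (0.01, 0.0)]
tx_origin = (0.0, 0.0)
true_scatterer = (0.0, 0.03)
observed = two_way_delays(true_scatterer, tx_origin, elements, 1540.0)

# The misfit vanishes at the true parameters and grows when either the
# position or the assumed speed of sound is wrong.
misfit_true = delay_misfit(observed, true_scatterer, tx_origin, elements, 1540.0)
misfit_wrong_c = delay_misfit(observed, true_scatterer, tx_origin, elements, 1480.0)
misfit_wrong_pos = delay_misfit(observed, (0.002, 0.03), tx_origin, elements, 1540.0)
```

Because both the position and the speed of sound enter the misfit as ordinary real-valued arguments, an optimizer could in principle adjust them together rather than searching over a fixed pixel grid. This sketch only evaluates the misfit and does not perform that optimization.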
Abstract: This paper presents a new approach to the FNC-1 fake news classification task that employs pre-trained encoder models from similar NLP tasks, namely sentence similarity and natural language inference. Two neural network architectures using this approach are proposed. Data augmentation methods are explored as a means of tackling class imbalance in the dataset, both by employing common pre-existing methods and by proposing a method for sample generation in the under-represented class using a novel sentence negation algorithm. Overall performance comparable to existing baselines is achieved, while accuracy on an under-represented but nonetheless important class for FNC-1 is significantly increased.
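The abstract does not describe the sentence negation algorithm, so the sketch below is only a naive, rule-based stand-in showing what negation-based sample generation could look like. The auxiliary-verb list and the function name `negate` are assumptions, and real text would need proper tokenization and parsing.

```python
# Small, deliberately incomplete set of auxiliary verbs (an assumption).
AUXILIARIES = {
    "is", "are", "was", "were", "will", "can", "has", "have",
    "does", "do", "did", "should", "would", "could",
}


def negate(sentence):
    """Flip the polarity of the first auxiliary verb found in the sentence.

    Inserts "not" after the auxiliary, or removes an existing "not".
    Returns None when no auxiliary is found, so callers can skip the sample.
    """
    tokens = sentence.split()
    for i, tok in enumerate(tokens):
        if tok.lower() in AUXILIARIES:
            if i + 1 < len(tokens) and tokens[i + 1].lower() == "not":
                return " ".join(tokens[:i + 1] + tokens[i + 2:])
            return " ".join(tokens[:i + 1] + ["not"] + tokens[i + 1:])
    return None


print(negate("The mayor was arrested on Friday"))  # The mayor was not arrested on Friday
print(negate("The claim is not true"))             # The claim is true
print(negate("Officials deny the report"))         # None
```

One plausible use, stated here as a guess rather than as the paper's procedure, is to pair a negated headline with its original article body to synthesize extra samples for a class that is rare in the training data.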