Abstract: Minimally invasive and robot-assisted surgery relies heavily on endoscopic imaging, yet surgical smoke produced by electrocautery and vessel-sealing instruments can severely degrade visual perception and hinder vision-based functionalities. We present a transformer-based surgical desmoking model with a physics-inspired desmoking head that jointly predicts a smoke-free image and the corresponding smoke map. To address the scarcity of paired smoky and smoke-free training data, we develop a synthetic data generation pipeline that blends artificial smoke patterns with real endoscopic images, yielding over 80,000 paired samples for supervised training. We further curate what is, to our knowledge, the largest paired surgical smoke dataset to date, comprising 5,817 image pairs captured with the da Vinci robotic surgical system, enabling benchmarking on high-resolution endoscopic images. Extensive experiments on both a public benchmark and our dataset demonstrate state-of-the-art image reconstruction performance compared with existing dehazing and desmoking approaches. We also assess the impact of desmoking on downstream stereo depth estimation and instrument segmentation, highlighting both the potential benefits and current limitations of digital smoke removal methods.
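
The idea of blending artificial smoke into clean frames, and of predicting a smoke map alongside the smoke-free image, can be illustrated with a simple per-pixel compositing model. The sketch below is an assumption, not the pipeline or desmoking head described in the abstract: it uses the common linear form I = (1 - m) J + m A, where J is the clean image, m is a smoke map with values in [0, 1), and A is a smoke brightness. The names `make_smoke_map`, `blend_smoke`, and `remove_smoke` are hypothetical, and images are grayscale nested lists so that the example needs no external libraries.

```python
import random


def make_smoke_map(height, width, seed=0, max_density=0.8):
    """Hypothetical smoke map: one radial blob with values in [0, max_density]."""
    rng = random.Random(seed)
    cy, cx = rng.uniform(0, height), rng.uniform(0, width)
    radius = 0.5 * max(height, width)
    smoke = []
    for y in range(height):
        row = []
        for x in range(width):
            dist = ((y - cy) ** 2 + (x - cx) ** 2) ** 0.5
            # Density falls off linearly with distance from the blob centre.
            row.append(max_density * max(0.0, 1.0 - dist / radius))
        smoke.append(row)
    return smoke


def blend_smoke(clean, smoke_map, smoke_color=0.9):
    """Assumed compositing model: I = (1 - m) * J + m * A."""
    return [
        [(1.0 - m) * j + m * smoke_color for j, m in zip(c_row, m_row)]
        for c_row, m_row in zip(clean, smoke_map)
    ]


def remove_smoke(smoky, smoke_map, smoke_color=0.9):
    """Invert the same model given a (predicted) smoke map; requires m < 1."""
    return [
        [(i - m * smoke_color) / (1.0 - m) for i, m in zip(s_row, m_row)]
        for s_row, m_row in zip(smoky, smoke_map)
    ]


# Build one synthetic (smoky, clean) training pair from a toy 4x4 "image".
clean = [[0.1 * (x + y) for x in range(4)] for y in range(4)]
smoke_map = make_smoke_map(4, 4)
smoky = blend_smoke(clean, smoke_map)
recovered = remove_smoke(smoky, smoke_map)
```

Under this assumed model, knowing the smoke map exactly makes the inversion exact, which is one intuition for why predicting the smoke map jointly with the clean image can help; a learned model would have to estimate both from the smoky frame alone.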




Abstract: Imaging through scattering media is a fundamental and pervasive challenge in fields ranging from medical diagnostics to astronomy. A promising strategy to overcome this challenge is wavefront modulation, which induces measurement diversity during image acquisition. Despite its importance, the design of optimal wavefront modulations for imaging through scattering remains under-explored. This paper introduces a novel learning-based framework to address this gap. Our approach jointly optimizes wavefront modulations and a computationally lightweight feedforward "proxy" reconstruction network. This network is trained to recover scenes obscured by scattering, using measurements that are modified by these modulations. The learned modulations produced by our framework generalize effectively to unseen scattering scenarios and exhibit remarkable versatility. During deployment, the learned modulations can be decoupled from the proxy network to augment other, more computationally expensive restoration algorithms. Through extensive experiments, we demonstrate that our approach significantly advances the state of the art in imaging through scattering media. Our project webpage is at https://wavemo-2024.github.io/.



Abstract: Ultra High Frequency Ultrasound (UHFUS) enables the visualization of highly deformable small and medium vessels in the hand. Intricate vessel-based measurements, such as intimal wall thickness and vessel wall compliance, require sub-millimeter vessel tracking between B-scans. Our fast GPU-based approach combines the advantages of local phase analysis, a distance-regularized level set, and an Extended Kalman Filter (EKF) to rapidly segment and track the deforming vessel contour. We validated the approach on 35 UHFUS sequences of vessels in the hand, and we show its transferability to 5 more diverse datasets acquired by a traditional High Frequency Ultrasound (HFUS) machine. To the best of our knowledge, this is the first algorithm capable of rapidly segmenting and tracking deformable vessel contours in 2D UHFUS images. It is also the fastest and most accurate system for 2D HFUS images.
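
The predict/update cycle that a Kalman-type tracker runs between B-scans can be sketched in a few lines. The example below is a simplification and not the method of the abstract: it is a plain linear Kalman filter (not an EKF) with an assumed constant-velocity model for a single coordinate of the vessel centre, and the names `kalman_step`, `q`, and `r` are hypothetical. An EKF follows the same cycle but linearizes a nonlinear motion or measurement model at each step.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-4, r=1e-2):
    """One predict/update cycle of a linear constant-velocity Kalman filter.

    state: (position, velocity); cov: 2x2 covariance as nested lists;
    z: measured position in the current frame (e.g. from segmentation).
    q and r are assumed process and measurement noise variances.
    """
    p, v = state
    (a, b), (c, d) = cov

    # Predict: x' = F x and P' = F P F^T + Q, with F = [[1, dt], [0, 1]].
    p_pred = p + dt * v
    v_pred = v
    p00 = a + dt * c + dt * (b + dt * d) + q
    p01 = b + dt * d
    p10 = c + dt * d
    p11 = d + q

    # Update with a position-only measurement, H = [1, 0].
    s = p00 + r                  # innovation variance
    k0, k1 = p00 / s, p10 / s    # Kalman gain
    y = z - p_pred               # innovation (measurement residual)
    new_state = (p_pred + k0 * y, v_pred + k1 * y)
    new_cov = [
        [(1.0 - k0) * p00, (1.0 - k0) * p01],
        [p10 - k1 * p00, p11 - k1 * p01],
    ]
    return new_state, new_cov


# Track a centre coordinate drifting by 0.5 units per frame.
state, cov = (0.0, 0.0), [[1.0, 0.0], [0.0, 1.0]]
for frame in range(1, 51):
    state, cov = kalman_step(state, cov, z=0.5 * frame)
```

In a tracking loop of this kind, the predicted state can also seed the segmentation in the next frame, which is a common reason for pairing a filter with a contour-evolution method.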