Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yiming Zhou

Dy3DGS-SLAM: Monocular 3D Gaussian Splatting SLAM for Dynamic Environments

Jun 06, 2025

Mingrui Li, Yiming Zhou, Hongxing Zhou, Xinggang Hu, Florian Roemer, Hongyu Wang, Ahmad Osman

Abstract:Current Simultaneous Localization and Mapping (SLAM) methods based on Neural Radiance Fields (NeRF) or 3D Gaussian Splatting excel in reconstructing static 3D scenes but struggle with tracking and reconstruction in dynamic environments, such as real-world scenes with moving elements. Existing NeRF-based SLAM approaches addressing dynamic challenges typically rely on RGB-D inputs, with few methods accommodating pure RGB input. To overcome these limitations, we propose Dy3DGS-SLAM, the first 3D Gaussian Splatting (3DGS) SLAM method for dynamic scenes using monocular RGB input. To address dynamic interference, we fuse optical flow masks and depth masks through a probabilistic model to obtain a fused dynamic mask. With only a single network iteration, this can constrain tracking scales and refine rendered geometry. Based on the fused dynamic mask, we designed a novel motion loss to constrain the pose estimation network for tracking. In mapping, we use the rendering loss of dynamic pixels, color, and depth to eliminate transient interference and occlusion caused by dynamic objects. Experimental results demonstrate that Dy3DGS-SLAM achieves state-of-the-art tracking and rendering in dynamic environments, outperforming or matching existing RGB-D methods.

Via

Access Paper or Ask Questions

Evaluating Modern Approaches in 3D Scene Reconstruction: NeRF vs Gaussian-Based Methods

Aug 08, 2024

Yiming Zhou, Zixuan Zeng, Andi Chen, Xiaofan Zhou, Haowei Ni, Shiyao Zhang, Panfeng Li, Liangxi Liu, Mengyao Zheng, Xupeng Chen

Abstract:Exploring the capabilities of Neural Radiance Fields (NeRF) and Gaussian-based methods in the context of 3D scene reconstruction, this study contrasts these modern approaches with traditional Simultaneous Localization and Mapping (SLAM) systems. Utilizing datasets such as Replica and ScanNet, we assess performance based on tracking accuracy, mapping fidelity, and view synthesis. Findings reveal that NeRF excels in view synthesis, offering unique capabilities in generating new perspectives from existing data, albeit at slower processing speeds. Conversely, Gaussian-based methods provide rapid processing and significant expressiveness but lack comprehensive scene completion. Enhanced by global optimization and loop closure techniques, newer methods like NICE-SLAM and SplaTAM not only surpass older frameworks such as ORB-SLAM2 in terms of robustness but also demonstrate superior performance in dynamic and complex environments. This comparative analysis bridges theoretical research with practical implications, shedding light on future developments in robust 3D scene reconstruction across various real-world applications.

* Accepted by 2024 6th International Conference on Data-driven Optimization of Complex Systems

Via

Access Paper or Ask Questions

Nonlinear Schrödinger Network

Jul 19, 2024

Yiming Zhou, Callen MacPhee, Tingyi Zhou, Bahram Jalali

Figure 1 for Nonlinear Schrödinger Network

Figure 2 for Nonlinear Schrödinger Network

Figure 3 for Nonlinear Schrödinger Network

Figure 4 for Nonlinear Schrödinger Network

Abstract:Deep neural networks (DNNs) have achieved exceptional performance across various fields by learning complex nonlinear mappings from large-scale datasets. However, they encounter challenges such as high computational costs and limited interpretability. To address these issues, hybrid approaches that integrate physics with AI are gaining interest. This paper introduces a novel physics-based AI model called the "Nonlinear Schr\"odinger Network", which treats the Nonlinear Schr\"odinger Equation (NLSE) as a general-purpose trainable model for learning complex patterns including nonlinear mappings and memory effects from data. Existing physics-informed machine learning methods use neural networks to approximate the solutions of partial differential equations (PDEs). In contrast, our approach directly treats the PDE as a trainable model to obtain general nonlinear mappings that would otherwise require neural networks. As a physics-inspired approach, it offers a more interpretable and parameter-efficient alternative to traditional black-box neural networks, achieving comparable or better accuracy in time series classification tasks while significantly reducing the number of required parameters. Notably, the trained Nonlinear Schr\"odinger Network is interpretable, with all parameters having physical meanings as properties of a virtual physical system that transforms the data to a more separable space. This interpretability allows for insight into the underlying dynamics of the data transformation process. Applications to time series forecasting have also been explored. While our current implementation utilizes the NLSE, the proposed method of using physics equations as trainable models to learn nonlinear mappings from data is not limited to the NLSE and may be extended to other master equations of physics.

Via

Access Paper or Ask Questions

Structure Gaussian SLAM with Manhattan World Hypothesis

May 30, 2024

Shuhong Liu, Heng Zhou, Liuzhuozheng Li, Yun Liu, Tianchen Deng, Yiming Zhou, Mingrui Li

Abstract:Gaussian SLAM systems have made significant advancements in improving the efficiency and fidelity of real-time reconstructions. However, these systems often encounter incomplete reconstructions in complex indoor environments, characterized by substantial holes due to unobserved geometry caused by obstacles or limited view angles. To address this challenge, we present Manhattan Gaussian SLAM (MG-SLAM), an RGB-D system that leverages the Manhattan World hypothesis to enhance geometric accuracy and completeness. By seamlessly integrating fused line segments derived from structured scenes, MG-SLAM ensures robust tracking in textureless indoor areas. Moreover, The extracted lines and planar surface assumption allow strategic interpolation of new Gaussians in regions of missing geometry, enabling efficient scene completion. Extensive experiments conducted on both synthetic and real-world scenes demonstrate that these advancements enable our method to achieve state-of-the-art performance, marking a substantial improvement in the capabilities of Gaussian SLAM systems.

Via

Access Paper or Ask Questions

Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

May 01, 2024

Zhenglin Li, Bo Guan, Yuanzhou Wei, Yiming Zhou, Jingyu Zhang, Jinxin Xu

Figure 1 for Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

Figure 2 for Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

Figure 3 for Mapping New Realities: Ground Truth Image Creation with Pix2Pix Image-to-Image Translation

Abstract:Generative Adversarial Networks (GANs) have significantly advanced image processing, with Pix2Pix being a notable framework for image-to-image translation. This paper explores a novel application of Pix2Pix to transform abstract map images into realistic ground truth images, addressing the scarcity of such images crucial for domains like urban planning and autonomous vehicle training. We detail the Pix2Pix model's utilization for generating high-fidelity datasets, supported by a dataset of paired map and aerial images, and enhanced by a tailored training regimen. The results demonstrate the model's capability to accurately render complex urban features, establishing its efficacy and potential for broad real-world applications.

Via

Access Paper or Ask Questions

Integrating AI in NDE: Techniques, Trends, and Further Directions

Apr 04, 2024

Eduardo Pérez, Cemil Emre Ardic, Ozan Çakıroğlu, Kevin Jacob, Sayako Kodera, Luca Pompa, Mohamad Rachid, Han Wang, Yiming Zhou, Cyril Zimmer(+2 more)

Figure 1 for Integrating AI in NDE: Techniques, Trends, and Further Directions

Figure 2 for Integrating AI in NDE: Techniques, Trends, and Further Directions

Figure 3 for Integrating AI in NDE: Techniques, Trends, and Further Directions

Figure 4 for Integrating AI in NDE: Techniques, Trends, and Further Directions

Abstract:The digital transformation is fundamentally changing our industries, affecting planning, execution as well as monitoring of production processes in a wide range of application fields. With product line-ups becoming more and more versatile and diverse, the necessary inspection and monitoring sparks significant novel requirements on the corresponding Nondestructive Evaluation (NDE) systems. The establishment of increasingly powerful approaches to incorporate Artificial Intelligence (AI) may provide just the needed innovation to solve some of these challenges. In this paper we provide a comprehensive survey about the usage of AI methods in NDE in light of the recent innovations towards NDE 4.0. Since we cannot discuss each NDE modality in one paper, we limit our attention to magnetic methods, ultrasound, thermography, as well as optical inspection. In addition to reviewing recent AI developments in each field, we draw common connections by pointing out NDE-related tasks that have a common underlying mathematical problem and categorizing the state of the art according to the corresponding sub-tasks. In so doing, interdisciplinary connections are drawn that provide a more complete overall picture.

Via

Access Paper or Ask Questions

Jointly Learning Selection Matrices For Transmitters, Receivers And Fourier Coefficients In Multichannel Imaging

Feb 29, 2024

Han Wang, Yiming Zhou, Eduardo Perez, Florian Roemer

Figure 1 for Jointly Learning Selection Matrices For Transmitters, Receivers And Fourier Coefficients In Multichannel Imaging

Figure 2 for Jointly Learning Selection Matrices For Transmitters, Receivers And Fourier Coefficients In Multichannel Imaging

Figure 3 for Jointly Learning Selection Matrices For Transmitters, Receivers And Fourier Coefficients In Multichannel Imaging

Figure 4 for Jointly Learning Selection Matrices For Transmitters, Receivers And Fourier Coefficients In Multichannel Imaging

Abstract:Strategic subsampling has become a focal point due to its effectiveness in compressing data, particularly in the Full Matrix Capture (FMC) approach in ultrasonic imaging. This paper introduces the Joint Deep Probabilistic Subsampling (J-DPS) method, which aims to learn optimal selection matrices simultaneously for transmitters, receivers, and Fourier coefficients. This task-based algorithm is realized by introducing a specialized measurement model and integrating a customized Complex Learned FISTA (CL-FISTA) network. We propose a parallel network architecture, partitioned into three segments corresponding to the three matrices, all working toward a shared optimization objective with adjustable loss allocation. A synthetic dataset is designed to reflect practical scenarios, and we provide quantitative comparisons with a traditional CRB-based algorithm, standard DPS, and J-DPS.

Via

Access Paper or Ask Questions

Proximal Dogleg Opportunistic Majorization for Nonconvex and Nonsmooth Optimization

Feb 29, 2024

Yiming Zhou, Wei Dai

Abstract:We consider minimizing a function consisting of a quadratic term and a proximable term which is possibly nonconvex and nonsmooth. This problem is also known as scaled proximal operator. Despite its simple form, existing methods suffer from slow convergence or high implementation complexity or both. To overcome these limitations, we develop a fast and user-friendly second-order proximal algorithm. Key innovation involves building and solving a series of opportunistically majorized problems along a hybrid Newton direction. The approach directly uses the precise Hessian of the quadratic term, and calculates the inverse only once, eliminating the iterative numerical approximation of the Hessian, a common practice in quasi-Newton methods. The algorithm's convergence to a critical point is established, and local convergence rate is derived based on the Kurdyka-Lojasiewicz property of the objective function. Numerical comparisons are conducted on well-known optimization problems. The results demonstrate that the proposed algorithm not only achieves a faster convergence but also tends to converge to a better local optimum compare to benchmark algorithms.

Via

Access Paper or Ask Questions

Communication-Efficient Distributed Learning with Local Immediate Error Compensation

Feb 19, 2024

Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, Dacheng Tao, Enhong Chen

Figure 1 for Communication-Efficient Distributed Learning with Local Immediate Error Compensation

Figure 2 for Communication-Efficient Distributed Learning with Local Immediate Error Compensation

Figure 3 for Communication-Efficient Distributed Learning with Local Immediate Error Compensation

Figure 4 for Communication-Efficient Distributed Learning with Local Immediate Error Compensation

Abstract:Gradient compression with error compensation has attracted significant attention with the target of reducing the heavy communication overhead in distributed learning. However, existing compression methods either perform only unidirectional compression in one iteration with higher communication cost, or bidirectional compression with slower convergence rate. In this work, we propose the Local Immediate Error Compensated SGD (LIEC-SGD) optimization algorithm to break the above bottlenecks based on bidirectional compression and carefully designed compensation approaches. Specifically, the bidirectional compression technique is to reduce the communication cost, and the compensation technique compensates the local compression error to the model update immediately while only maintaining the global error variable on the server throughout the iterations to boost its efficacy. Theoretically, we prove that LIEC-SGD is superior to previous works in either the convergence rate or the communication cost, which indicates that LIEC-SGD could inherit the dual advantages from unidirectional compression and bidirectional compression. Finally, experiments of training deep neural networks validate the effectiveness of the proposed LIEC-SGD algorithm.

Via

Access Paper or Ask Questions

Time Stretch with Continuous-Wave Lasers for Practical Fast Realtime Measurements

Sep 19, 2023

Tingyi Zhou, Yuta Goto, Takeshi Makino, Callen MacPhee, Yiming Zhou, Asad M. Madni, Hideaki Furukawa, Naoya Wada, Bahram Jalali

Figure 1 for Time Stretch with Continuous-Wave Lasers for Practical Fast Realtime Measurements

Figure 2 for Time Stretch with Continuous-Wave Lasers for Practical Fast Realtime Measurements

Figure 3 for Time Stretch with Continuous-Wave Lasers for Practical Fast Realtime Measurements

Figure 4 for Time Stretch with Continuous-Wave Lasers for Practical Fast Realtime Measurements

Abstract:Realtime high-throughput sensing and detection enables the capture of rare events within sub-picosecond time scale, which makes it possible for scientists to uncover the mystery of ultrafast physical processes. Photonic time stretch is one of the most successful approaches that utilize the ultra-wide bandwidth of mode-locked laser for detecting ultrafast signal. Though powerful, it relies on supercontinuum mode-locked laser source, which is expensive and difficult to integrate. This greatly limits the application of this technology. Here we propose a novel Continuous Wave (CW) implementation of the photonic time stretch. Instead of a supercontinuum mode-locked laser, a wavelength division multiplexed (WDM) CW laser, pulsed by electro-optic (EO) modulation, is adopted as the laser source. This opens up the possibility for low-cost integrated time stretch systems. This new approach is validated via both simulation and experiment. Two scenarios for potential application are also described.

Via

Access Paper or Ask Questions