Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Chunyu Lin

Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Aug 30, 2024

Zhijie Shen, Chunyu Lin, Lang Nie, Kang Liao

Figure 1 for Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Figure 2 for Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Figure 3 for Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Figure 4 for Revisiting 360 Depth Estimation with PanoGabor: A New Fusion Perspective

Abstract:Depth estimation from a monocular 360 image is important to the perception of the entire 3D environment. However, the inherent distortion and large field of view (FoV) in 360 images pose great challenges for this task. To this end, existing mainstream solutions typically introduce additional perspective-based 360 representations (\textit{e.g.}, Cubemap) to achieve effective feature extraction. Nevertheless, regardless of the introduced representations, they eventually need to be unified into the equirectangular projection (ERP) format for the subsequent depth estimation, which inevitably reintroduces the troublesome distortions. In this work, we propose an oriented distortion-aware Gabor Fusion framework (PGFuse) to address the above challenges. First, we introduce Gabor filters that analyze texture in the frequency domain, thereby extending the receptive fields and enhancing depth cues. To address the reintroduced distortions, we design a linear latitude-aware distortion representation method to generate customized, distortion-aware Gabor filters (PanoGabor filters). Furthermore, we design a channel-wise and spatial-wise unidirectional fusion module (CS-UFM) that integrates the proposed PanoGabor filters to unify other representations into the ERP format, delivering effective and distortion-free features. Considering the orientation sensitivity of the Gabor transform, we introduce a spherical gradient constraint to stabilize this sensitivity. Experimental results on three popular indoor 360 benchmarks demonstrate the superiority of the proposed PGFuse to existing state-of-the-art solutions. Code can be available upon acceptance.

Via

Access Paper or Ask Questions

Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph

Aug 28, 2024

Zherong Zhang, Chunyu Lin, Shujuan Huang, Shangrong Yang, Yao Zhao

Figure 1 for Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph

Figure 2 for Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph

Figure 3 for Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph

Figure 4 for Str-L Pose: Integrating Point and Structured Line for Relative Pose Estimation in Dual-Graph

Abstract:Relative pose estimation is crucial for various computer vision applications, including Robotic and Autonomous Driving. Current methods primarily depend on selecting and matching feature points prone to incorrect matches, leading to poor performance. Consequently, relying solely on point-matching relationships for pose estimation is a huge challenge. To overcome these limitations, we propose a Geometric Correspondence Graph neural network that integrates point features with extra structured line segments. This integration of matched points and line segments further exploits the geometry constraints and enhances model performance across different environments. We employ the Dual-Graph module and Feature Weighted Fusion Module to aggregate geometric and visual features effectively, facilitating complex scene understanding. We demonstrate our approach through extensive experiments on the DeMoN and KITTI Odometry datasets. The results show that our method is competitive with state-of-the-art techniques.

Via

Access Paper or Ask Questions

SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

Apr 23, 2024

Junsong Zhang, Zisong Chen, Chunyu Lin, Lang Nie, Zhijie Shen, Junda Huang, Yao Zhao

Figure 1 for SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

Figure 2 for SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

Figure 3 for SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

Figure 4 for SGFormer: Spherical Geometry Transformer for 360 Depth Estimation

Abstract:Panoramic distortion poses a significant challenge in 360 depth estimation, particularly pronounced at the north and south poles. Existing methods either adopt a bi-projection fusion strategy to remove distortions or model long-range dependencies to capture global structures, which can result in either unclear structure or insufficient local perception. In this paper, we propose a spherical geometry transformer, named SGFormer, to address the above issues, with an innovative step to integrate spherical geometric priors into vision transformers. To this end, we retarget the transformer decoder to a spherical prior decoder (termed SPDecoder), which endeavors to uphold the integrity of spherical structures during decoding. Concretely, we leverage bipolar re-projection, circular rotation, and curve local embedding to preserve the spherical characteristics of equidistortion, continuity, and surface distance, respectively. Furthermore, we present a query-based global conditional position embedding to compensate for spatial structure at varying resolutions. It not only boosts the global perception of spatial position but also sharpens the depth structure across different patches. Finally, we conduct extensive experiments on popular benchmarks, demonstrating our superiority over state-of-the-art solutions.

Via

Access Paper or Ask Questions

Digging into contrastive learning for robust depth estimation with diffusion models

Apr 17, 2024

Jiyuan Wang, Chunyu Lin, Lang Nie, Kang Liao, Shuwei Shao, Yao Zhao

Figure 1 for Digging into contrastive learning for robust depth estimation with diffusion models

Figure 2 for Digging into contrastive learning for robust depth estimation with diffusion models

Figure 3 for Digging into contrastive learning for robust depth estimation with diffusion models

Figure 4 for Digging into contrastive learning for robust depth estimation with diffusion models

Abstract:Recently, diffusion-based depth estimation methods have drawn widespread attention due to their elegant denoising patterns and promising performance. However, they are typically unreliable under adverse conditions prevalent in real-world scenarios, such as rainy, snowy, etc. In this paper, we propose a novel robust depth estimation method called D4RD, featuring a custom contrastive learning mode tailored for diffusion models to mitigate performance degradation in complex environments. Concretely, we integrate the strength of knowledge distillation into contrastive learning, building the `trinity' contrastive scheme. This scheme utilizes the sampled noise of the forward diffusion process as a natural reference, guiding the predicted noise in diverse scenes toward a more stable and precise optimum. Moreover, we extend noise-level trinity to encompass more generic feature and image levels, establishing a multi-level contrast to distribute the burden of robust perception across the overall network. Before addressing complex scenarios, we enhance the stability of the baseline diffusion model with three straightforward yet effective improvements, which facilitate convergence and remove depth outliers. Extensive experiments demonstrate that D4RD surpasses existing state-of-the-art solutions on synthetic corruption datasets and real-world weather conditions. The code for D4RD will be made available for further exploration and adoption.

* 8 pages,6 figures

Via

Access Paper or Ask Questions

Eliminating Warping Shakes for Unsupervised Online Video Stitching

Mar 11, 2024

Lang Nie, Chunyu Lin, Kang Liao, Yun Zhang, Shuaicheng Liu, Yao Zhao

Figure 1 for Eliminating Warping Shakes for Unsupervised Online Video Stitching

Figure 2 for Eliminating Warping Shakes for Unsupervised Online Video Stitching

Figure 3 for Eliminating Warping Shakes for Unsupervised Online Video Stitching

Figure 4 for Eliminating Warping Shakes for Unsupervised Online Video Stitching

Abstract:In this paper, we retarget video stitching to an emerging issue, named warping shake, when extending image stitching to video stitching. It unveils the temporal instability of warped content in non-overlapping regions, despite image stitching having endeavored to preserve the natural structures. Therefore, in most cases, even if the input videos to be stitched are stable, the stitched video will inevitably cause undesired warping shakes and affect the visual experience. To eliminate the shakes, we propose StabStitch to simultaneously realize video stitching and video stabilization in a unified unsupervised learning framework. Starting from the camera paths in video stabilization, we first derive the expression of stitching trajectories in video stitching by elaborately integrating spatial and temporal warps. Then a warp smoothing model is presented to optimize them with a comprehensive consideration regarding content alignment, trajectory smoothness, spatial consistency, and online collaboration. To establish an evaluation benchmark and train the learning framework, we build a video stitching dataset with a rich diversity in camera motions and scenes. Compared with existing stitching solutions, StabStitch exhibits significant superiority in scene robustness and inference speed in addition to stitching and stabilization performance, contributing to a robust and real-time online video stitching system. The code and dataset will be available at https://github.com/nie-lang/StabStitch.

Via

Access Paper or Ask Questions

Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond

Jan 24, 2024

Lang Nie, Chunyu Lin, Kang Liao, Shuaicheng Liu, Yao Zhao

Figure 1 for Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond

Figure 2 for Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond

Figure 3 for Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond

Figure 4 for Semi-Supervised Coupled Thin-Plate Spline Model for Rotation Correction and Beyond

Abstract:Thin-plate spline (TPS) is a principal warp that allows for representing elastic, nonlinear transformation with control point motions. With the increase of control points, the warp becomes increasingly flexible but usually encounters a bottleneck caused by undesired issues, e.g., content distortion. In this paper, we explore generic applications of TPS in single-image-based warping tasks, such as rotation correction, rectangling, and portrait correction. To break this bottleneck, we propose the coupled thin-plate spline model (CoupledTPS), which iteratively couples multiple TPS with limited control points into a more flexible and powerful transformation. Concretely, we first design an iterative search to predict new control points according to the current latent condition. Then, we present the warping flow as a bridge for the coupling of different TPS transformations, effectively eliminating interpolation errors caused by multiple warps. Besides, in light of the laborious annotation cost, we develop a semi-supervised learning scheme to improve warping quality by exploiting unlabeled data. It is formulated through dual transformation between the searched control points of unlabeled data and its graphic augmentation, yielding an implicit correction consistency constraint. Finally, we collect massive unlabeled data to exhibit the benefit of our semi-supervised scheme in rotation correction. Extensive experiments demonstrate the superiority and universality of CoupledTPS over the existing state-of-the-art (SoTA) solutions for rotation correction and beyond. The code and data will be available at https://github.com/nie-lang/CoupledTPS.

Via

Access Paper or Ask Questions

360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception

Dec 26, 2023

Zhijie Shen, Chunyu Lin, Junsong Zhang, Lang Nie, Kang Liao, Yao Zhao

Figure 1 for 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception

Figure 2 for 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception

Figure 3 for 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception

Figure 4 for 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception

Abstract:Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results as the compression process often muddles the semantics between various planes. Besides, these data-driven approaches impose an urgent demand for massive data annotations, which are laborious and time-consuming. For the first problem, we propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics. DOPNet consists of three modules that are integrated to deliver distortion-free, semantics-clean, and detail-sharp disentangled representations, which benefit the subsequent layout recovery. For the second problem, we present an unsupervised adaptation technique tailored for horizon-depth and ratio representations. Concretely, we introduce an optimization strategy for decision-level layout analysis and a 1D cost volume construction method for feature-level multi-view aggregation, both of which are designed to fully exploit the geometric consistency across multiple perspectives. The optimizer provides a reliable set of pseudo-labels for network training, while the 1D cost volume enriches each view with comprehensive scene information derived from other perspectives. Extensive experiments demonstrate that our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.

* arXiv admin note: substantial text overlap with arXiv:2303.00971

Via

Access Paper or Ask Questions

WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

Oct 09, 2023

Jiyuan Wang, Chunyu Lin, Lang Nie, Shujun Huang, Yao Zhao, Xing Pan, Rui Ai

Figure 1 for WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

Figure 2 for WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

Figure 3 for WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

Figure 4 for WeatherDepth: Curriculum Contrastive Learning for Self-Supervised Depth Estimation under Adverse Weather Conditions

Abstract:Depth estimation models have shown promising performance on clear scenes but fail to generalize to adverse weather conditions due to illumination variations, weather particles, etc. In this paper, we propose WeatherDepth, a self-supervised robust depth estimation model with curriculum contrastive learning, to tackle performance degradation in complex weather conditions. Concretely, we first present a progressive curriculum learning scheme with three simple-to-complex curricula to gradually adapt the model from clear to relative adverse, and then to adverse weather scenes. It encourages the model to gradually grasp beneficial depth cues against the weather effect, yielding smoother and better domain adaption. Meanwhile, to prevent the model from forgetting previous curricula, we integrate contrastive learning into different curricula. Drawn the reference knowledge from the previous course, our strategy establishes a depth consistency constraint between different courses towards robust depth estimation in diverse weather. Besides, to reduce manual intervention and better adapt to different models, we designed an adaptive curriculum scheduler to automatically search for the best timing for course switching. In the experiment, the proposed solution is proven to be easily incorporated into various architectures and demonstrates state-of-the-art (SoTA) performance on both synthetic and real weather datasets.

Via

Access Paper or Ask Questions

GAFlow: Incorporating Gaussian Attention into Optical Flow

Sep 28, 2023

Ao Luo, Fan Yang, Xin Li, Lang Nie, Chunyu Lin, Haoqiang Fan, Shuaicheng Liu

Abstract:Optical flow, or the estimation of motion fields from image sequences, is one of the fundamental problems in computer vision. Unlike most pixel-wise tasks that aim at achieving consistent representations of the same category, optical flow raises extra demands for obtaining local discrimination and smoothness, which yet is not fully explored by existing approaches. In this paper, we push Gaussian Attention (GA) into the optical flow models to accentuate local properties during representation learning and enforce the motion affinity during matching. Specifically, we introduce a novel Gaussian-Constrained Layer (GCL) which can be easily plugged into existing Transformer blocks to highlight the local neighborhood that contains fine-grained structural information. Moreover, for reliable motion analysis, we provide a new Gaussian-Guided Attention Module (GGAM) which not only inherits properties from Gaussian distribution to instinctively revolve around the neighbor fields of each point but also is empowered to put the emphasis on contextually related regions during matching. Our fully-equipped model, namely Gaussian Attention Flow network (GAFlow), naturally incorporates a series of novel Gaussian-based modules into the conventional optical flow framework for reliable motion analysis. Extensive experiments on standard optical flow datasets consistently demonstrate the exceptional performance of the proposed approach in terms of both generalization ability evaluation and online benchmark testing. Code is available at https://github.com/LA30/GAFlow.

* To appear in ICCV-2023

Via

Access Paper or Ask Questions

Deep Learning for Camera Calibration and Beyond: A Survey

Mar 19, 2023

Kang Liao, Lang Nie, Shujuan Huang, Chunyu Lin, Jing Zhang, Yao Zhao, Moncef Gabbouj, Dacheng Tao

Figure 1 for Deep Learning for Camera Calibration and Beyond: A Survey

Figure 2 for Deep Learning for Camera Calibration and Beyond: A Survey

Figure 3 for Deep Learning for Camera Calibration and Beyond: A Survey

Figure 4 for Deep Learning for Camera Calibration and Beyond: A Survey

Abstract:Camera calibration involves estimating camera parameters to infer geometric features from captured sequences, which is crucial for computer vision and robotics. However, conventional calibration is laborious and requires dedicated collection. Recent efforts show that learning-based solutions have the potential to be used in place of the repeatability works of manual calibrations. Among these solutions, various learning strategies, networks, geometric priors, and datasets have been investigated. In this paper, we provide a comprehensive survey of learning-based camera calibration techniques, by analyzing their strengths and limitations. Our main calibration categories include the standard pinhole camera model, distortion camera model, cross-view model, and cross-sensor model, following the research trend and extended applications. As there is no benchmark in this community, we collect a holistic calibration dataset that can serve as a public platform to evaluate the generalization of existing methods. It comprises both synthetic and real-world data, with images and videos captured by different cameras in diverse scenes. Toward the end of this paper, we discuss the challenges and provide further research directions. To our knowledge, this is the first survey for the learning-based camera calibration (spanned 8 years). The summarized methods, datasets, and benchmarks are available and will be regularly updated at https://github.com/KangLiao929/Awesome-Deep-Camera-Calibration.

* Github repository: https://github.com/KangLiao929/Awesome-Deep-Camera-Calibration

Via

Access Paper or Ask Questions