Yufu Qu

Burst Image Super-Resolution via Multi-Cross Attention Encoding and Multi-Scan State-Space Decoding

May 26, 2025

Tengda Huang, Yu Zhang, Tianren Li, Yufu Qu, Fulin Liu, Zhenzhong Wei

Abstract: Multi-image super-resolution (MISR) can achieve higher image quality than single-image super-resolution (SISR) by aggregating sub-pixel information from multiple spatially shifted frames. Among MISR tasks, burst super-resolution (BurstSR) has gained significant attention due to its wide range of applications. Recent methods have increasingly adopted Transformers over convolutional neural networks (CNNs) in super-resolution tasks, due to their superior ability to capture both local and global context. However, most existing approaches still rely on fixed and narrow attention windows that restrict the perception of features beyond the local field. This limitation hampers alignment and feature aggregation, both of which are crucial for high-quality super-resolution. To address these limitations, we propose a novel feature extractor that incorporates two newly designed attention mechanisms: overlapping cross-window attention and cross-frame attention, enabling more precise and efficient extraction of sub-pixel information across multiple frames. Furthermore, we introduce a Multi-scan State-Space Module with the cross-frame attention mechanism to enhance feature aggregation. Extensive experiments on both synthetic and real-world benchmarks demonstrate the superiority of our approach. Additional evaluations on ISO 12233 resolution test charts further confirm its enhanced super-resolution performance.

* 32 pages, 13 figures, submitted to 'Image and Vision Computing'
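
The abstract's opening claim, that several spatially shifted frames carry more sub-pixel information than a single frame, can be illustrated with a toy 1D shift-and-add sketch. This is not the paper's method (which uses learned attention and state-space modules); the function names, the integer shifts, and the test signal are all assumptions chosen for illustration, and real bursts have unknown, fractional shifts that must be estimated.

```python
def downsample(signal, factor, offset):
    """Simulate one low-resolution frame: every `factor`-th sample from `offset`."""
    return signal[offset::factor]


def shift_and_add(frames, offsets, factor, length):
    """Put each low-res sample back on the high-res grid at its known shift.

    Grid positions that no frame observed are returned as None.
    """
    total = [0.0] * length
    count = [0] * length
    for frame, offset in zip(frames, offsets):
        for i, value in enumerate(frame):
            pos = offset + i * factor
            total[pos] += value
            count[pos] += 1
    return [t / c if c else None for t, c in zip(total, count)]


# Hypothetical high-resolution signal and a burst of four shifted frames.
high_res = [float(x * x % 7) for x in range(12)]
factor = 4
offsets = [0, 1, 2, 3]
frames = [downsample(high_res, factor, o) for o in offsets]

# One frame alone leaves most of the high-res grid unobserved.
single = shift_and_add(frames[:1], offsets[:1], factor, len(high_res))
# The full burst covers every grid position.
burst = shift_and_add(frames, offsets, factor, len(high_res))
```

With a single frame, only every fourth grid position is filled; with all four shifted frames, each position is observed once. In practice the shifts are not known in advance, which is why the abstract stresses alignment and feature aggregation.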

RGBD-Glue: General Feature Combination for Robust RGB-D Point Cloud Registration

May 13, 2024

Congjia Chen, Xiaoyu Jia, Yanhong Zheng, Yufu Qu

Abstract: Point cloud registration is a fundamental task for estimating rigid transformations between point clouds. Previous studies have used geometric information for feature extraction, matching, and transformation estimation. Recently, owing to the advancement of RGB-D sensors, researchers have attempted to utilize visual information to improve registration performance. However, these studies focused on extracting distinctive features by deep feature fusion, which cannot effectively mitigate the negative effects of each feature's weaknesses and cannot fully leverage the valid information. In this paper, we propose a new feature combination framework, which applies a looser but more effective fusion and can achieve better performance. An explicit filter based on transformation consistency is designed for the combination framework to overcome each feature's weaknesses. In addition, an adaptive threshold determined by the error distribution is proposed to extract more valid information from the two types of features. Owing to this distinctive design, our proposed framework can estimate more accurate correspondences and is applicable to both hand-crafted and learning-based feature descriptors. Experiments on ScanNet show that our method achieves state-of-the-art performance, with a rotation accuracy of 99.1%.
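
As a rough illustration of the two ideas named in the abstract, a transformation-consistency filter and a threshold derived from the error distribution, the sketch below keeps only the correspondences whose residual under a candidate rigid transformation falls below an adaptive cutoff. The abstract does not specify the actual filter or threshold rule, so everything here is an assumption: the function names, the median-plus-MAD cutoff, the constant `k`, and the toy data are hypothetical.

```python
import math
from statistics import median


def apply_rigid(R, t, p):
    """Apply rotation R (3x3 nested lists) and translation t to a 3D point p."""
    return tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def residuals(R, t, correspondences):
    """Distance between each transformed source point and its matched target."""
    return [math.dist(apply_rigid(R, t, src), dst) for src, dst in correspondences]


def adaptive_threshold(errors, k=3.0, floor=1e-6):
    """Assumed rule: median + k * MAD of the residuals (not taken from the paper)."""
    med = median(errors)
    mad = median(abs(e - med) for e in errors)
    return med + k * max(mad, floor)


def filter_consistent(R, t, correspondences, k=3.0):
    """Keep correspondences that agree with the candidate transformation."""
    errors = residuals(R, t, correspondences)
    tau = adaptive_threshold(errors, k)
    return [c for c, e in zip(correspondences, errors) if e <= tau]


# Toy data: a 90-degree rotation about z plus a shift along x.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
t = (1.0, 0.0, 0.0)
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
good = [(p, apply_rigid(R, t, p)) for p in points]
outlier = ((2.0, 0.0, 1.0), (5.0, 5.0, 5.0))  # deliberately wrong match

kept = filter_consistent(R, t, good + [outlier])
```

Because the cutoff is computed from the residuals themselves, it tightens when most matches agree with the transformation and loosens when they are noisy, which is one plausible reading of a threshold "determined by the error distribution".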
