
Rui Zhou


First realization of macroscopic Fourier ptychography for hundred-meter distance sub-diffraction imaging

Oct 23, 2023
Qi Zhang, Yuran Lu, Yinghui Guo, Yingjie Shang, Mingbo Pu, Yulong Fan, Rui Zhou, Xiaoyin Li, Fei Zhang, Mingfeng Xu, Xiangang Luo


Fourier ptychography (FP) imaging, drawing on the idea of synthetic aperture, has been demonstrated as a potential approach for remote sub-diffraction-limited imaging. Nevertheless, the farthest imaging distance remains limited to around 10 m even though there has been significant improvement in macroscopic FP. The most severe issue in increasing the imaging distance is the field-of-view (FoV) limitation caused by the far-field condition for diffraction. Here, we propose to modify the Fourier far-field condition for rough reflective objects, aiming to overcome the small-FoV limitation by using a divergent beam to illuminate the objects. A joint optimization of pupil function and target image is utilized to attain an aberration-free image while estimating the pupil function simultaneously. Benefiting from the optimized reconstruction algorithm, which effectively expands the camera's effective aperture, we experimentally implement several FP systems suited for imaging distances of 12 m, 90 m, and 170 m, with a maximum synthetic aperture of 200 mm. The maximum imaging distance and synthetic aperture are thus improved by more than one order of magnitude over state-of-the-art works, together with a fourfold improvement in resolution. Our findings demonstrate significant potential for advancing the field of macroscopic FP, propelling it into a new stage of development.
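The joint object/pupil optimization mentioned in the abstract is commonly implemented as an embedded pupil function recovery loop. Below is a minimal NumPy sketch of such a loop; the variable conventions, step sizes alpha/beta, and the aperture-offset parametrization are assumptions made for illustration, not the authors' implementation.

    import numpy as np

    def fp_reconstruct(measurements, offsets, pupil, spectrum_shape, n_iters=20, alpha=1.0, beta=1.0):
        # measurements: list of low-resolution intensity images, one per aperture position
        # offsets: list of (row, col) top-left positions of each sub-aperture patch in the
        #          centered object spectrum (an assumed parametrization for this sketch)
        # pupil: complex array holding the initial pupil-function guess, same shape as a measurement
        O = np.fft.fftshift(np.fft.fft2(np.ones(spectrum_shape, dtype=complex)))  # flat initial object spectrum
        P = pupil.astype(complex)
        h, w = P.shape
        for _ in range(n_iters):
            for I_meas, (r, c) in zip(measurements, offsets):
                sub = O[r:r + h, c:c + w].copy()                    # spectrum patch seen through this aperture
                psi = np.fft.ifft2(np.fft.ifftshift(sub * P))       # estimated low-resolution complex field
                psi = np.sqrt(I_meas) * np.exp(1j * np.angle(psi))  # enforce the measured magnitude
                d = np.fft.fftshift(np.fft.fft2(psi)) - sub * P     # correction in the spectrum domain
                # ePIE/EPRY-style joint updates of the object spectrum and the pupil function
                O[r:r + h, c:c + w] = sub + alpha * np.conj(P) * d / (np.abs(P).max() ** 2 + 1e-12)
                P = P + beta * np.conj(sub) * d / (np.abs(sub).max() ** 2 + 1e-12)
        return np.fft.ifft2(np.fft.ifftshift(O)), P   # high-resolution complex object estimate, recovered pupil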


Network Topology Inference with Sparsity and Laplacian Constraints

Sep 02, 2023
Jiaxi Ying, Xi Han, Rui Zhou, Xiwen Wang, Hing Cheung So

We tackle the network topology inference problem by utilizing Laplacian constrained Gaussian graphical models, which recast the task as estimating a precision matrix in the form of a graph Laplacian. Recent research \cite{ying2020nonconvex} has uncovered the limitations of the widely used $\ell_1$-norm in learning sparse graphs under this model: empirically, the number of nonzero entries in the solution grows with the regularization parameter of the $\ell_1$-norm; theoretically, a large regularization parameter leads to a fully connected (densest) graph. To overcome these challenges, we propose a graph Laplacian estimation method incorporating the $\ell_0$-norm constraint. An efficient gradient projection algorithm is developed to solve the resulting optimization problem, characterized by sparsity and Laplacian constraints. Through numerical experiments with synthetic and financial time-series datasets, we demonstrate the effectiveness of the proposed method in network topology inference.
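To make the gradient-projection idea concrete, here is a minimal NumPy sketch of $\ell_0$-constrained Laplacian estimation. The objective with the rank-correction term J = (1/p) * ones, the initialization, and the step size are assumptions for illustration rather than the paper's exact algorithm.

    import numpy as np

    def laplacian_from_weights(w, p):
        # Map nonnegative edge weights (upper-triangular order) to a graph Laplacian.
        iu = np.triu_indices(p, k=1)
        A = np.zeros((p, p))
        A[iu] = w
        A = A + A.T
        return np.diag(A.sum(axis=1)) - A

    def project_sparse_nonneg(w, k):
        # Projection onto {w >= 0, ||w||_0 <= k}: clip negatives, then keep the k largest entries.
        w = np.maximum(w, 0.0)
        w[np.argsort(w)[: max(w.size - k, 0)]] = 0.0
        return w

    def estimate_sparse_laplacian(S, k, step=1e-3, n_iters=500):
        # Minimize -logdet(L(w) + J) + trace(L(w) S) over k-sparse nonnegative edge weights w,
        # where J = (1/p) * ones handles the Laplacian's null space (a common device in this line of work).
        p = S.shape[0]
        iu = np.triu_indices(p, k=1)
        J = np.ones((p, p)) / p
        w = np.full(iu[0].size, 1.0 / p)                 # assumed initialization
        for _ in range(n_iters):
            L = laplacian_from_weights(w, p)
            G = S - np.linalg.inv(L + J)                 # gradient of the objective w.r.t. the Laplacian
            # chain rule: the weight of edge (i, j) receives G_ii + G_jj - 2 G_ij
            grad_w = G[iu[0], iu[0]] + G[iu[1], iu[1]] - 2.0 * G[iu[0], iu[1]]
            w = project_sparse_nonneg(w - step * grad_w, k)
        return laplacian_from_weights(w, p)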


Internal Cross-layer Gradients for Extending Homogeneity to Heterogeneity in Federated Learning

Aug 22, 2023
Yun-Hin Chan, Rui Zhou, Running Zhao, Zhihan Jiang, Edith C. -H. Ngai


Federated learning (FL) inevitably confronts the challenge of system heterogeneity in practical scenarios. To enhance the ability of most model-homogeneous FL methods to handle system heterogeneity, we propose a training scheme that extends their capabilities to cope with this challenge. In this paper, we commence our study with a detailed exploration of homogeneous and heterogeneous FL settings and discover three key observations: (1) a positive correlation between client performance and layer similarities, (2) higher similarities in the shallow layers than in the deep layers, and (3) smoother gradient distributions indicating higher layer similarities. Building upon these observations, we propose InCo Aggregation, which leverages internal cross-layer gradients, a mixture of gradients from shallow and deep layers within a server model, to augment the similarity in the deep layers without requiring additional communication between clients. Furthermore, our method can be tailored to model-homogeneous FL methods such as FedAvg, FedProx, FedNova, Scaffold, and MOON, expanding their capabilities to handle system heterogeneity. Copious experimental results validate the effectiveness of InCo Aggregation, spotlighting internal cross-layer gradients as a promising avenue for enhancing performance in heterogeneous FL.
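As a rough illustration of the cross-layer idea (blending a shallow-layer gradient into a deep layer on the server), the toy function below mixes two gradient tensors. The shape alignment and the mixing coefficient mu are purely illustrative assumptions, not the paper's aggregation rule.

    import torch

    def mix_cross_layer_gradient(shallow_grad, deep_grad, mu=0.5):
        # Blend a shallow-layer gradient into a deep-layer gradient inside the server model,
        # so the deep layer inherits some of the higher cross-client similarity observed in
        # shallow layers. Alignment scheme and mu are assumptions for this sketch.
        flat = shallow_grad.flatten()
        if flat.numel() < deep_grad.numel():                          # tile if the shallow layer is smaller
            reps = (deep_grad.numel() + flat.numel() - 1) // flat.numel()
            flat = flat.repeat(reps)
        aligned = flat[: deep_grad.numel()].reshape(deep_grad.shape)
        return (1.0 - mu) * deep_grad + mu * aligned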

* Preprint. Under review 

End-to-End Lane detection with One-to-Several Transformer

May 13, 2023
Kunyang Zhou, Rui Zhou


Although lane detection methods have shown impressive performance in real-world scenarios, most methods require post-processing, which is not robust enough. Therefore, end-to-end detectors like the DEtection TRansformer (DETR) have been introduced to lane detection. However, the one-to-one label assignment in DETR can degrade training efficiency due to label semantic conflicts. Besides, the positional query in DETR is unable to provide an explicit positional prior, making it difficult to optimize. In this paper, we present the One-to-Several Transformer (O2SFormer). We first propose one-to-several label assignment, which combines one-to-many and one-to-one label assignment to solve label semantic conflicts while keeping end-to-end detection. To overcome the difficulty of optimizing one-to-one assignment, we further propose a layer-wise soft label that dynamically adjusts the positive weight of positive lane anchors in different decoder layers. Finally, we design a dynamic anchor-based positional query to exploit positional priors by incorporating lane anchors into the positional query. Experimental results show that O2SFormer with a ResNet50 backbone achieves a 77.83% F1 score on the CULane dataset, outperforming existing Transformer-based and CNN-based detectors. Furthermore, O2SFormer converges 12.5x faster than DETR with the ResNet18 backbone.
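One plausible reading of the layer-wise soft label is a per-decoder-layer weighting of positive anchors that sharpens toward the final layer. The sketch below illustrates that reading; the names, the exponent schedule, and gamma are assumptions, not the paper's exact formula.

    import torch

    def layer_wise_soft_labels(anchor_quality, n_layers, gamma=0.5):
        # anchor_quality: (num_pos,) matching scores in [0, 1] for the positive lane anchors.
        # Early decoder layers spread weight over several positives (one-to-many flavour),
        # while deeper layers concentrate it on the best anchors (one-to-one flavour).
        weights = []
        for layer in range(n_layers):
            t = (layer + 1) / n_layers                          # ramps from 1/n_layers to 1 across layers
            w = anchor_quality ** (gamma + t * (1.0 - gamma))   # sharper weighting in deeper layers
            weights.append(w / w.sum().clamp(min=1e-12))        # normalize per layer
        return torch.stack(weights)                             # (n_layers, num_pos) positive weights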

* code: https://github.com/zkyseu/O2SFormer 

End to End Lane detection with One-to-Several Transformer

May 02, 2023
Kunyang Zhou, Rui Zhou


Although lane detection methods have shown impressive performance in real-world scenarios, most methods require post-processing, which is not robust enough. Therefore, end-to-end detectors like the DEtection TRansformer (DETR) have been introduced to lane detection. However, the one-to-one label assignment in DETR can degrade training efficiency due to label semantic conflicts. Besides, the positional query in DETR is unable to provide an explicit positional prior, making it difficult to optimize. In this paper, we present the One-to-Several Transformer (O2SFormer). We first propose one-to-several label assignment, which combines one-to-one and one-to-many label assignments to improve training efficiency while keeping end-to-end detection. To overcome the difficulty of optimizing one-to-one assignment, we further propose a layer-wise soft label that adjusts the positive weight of positive lane anchors across different decoder layers. Finally, we design a dynamic anchor-based positional query to exploit positional priors by incorporating lane anchors into the positional query. Experimental results show that O2SFormer significantly speeds up the convergence of DETR and outperforms Transformer-based and CNN-based detectors on the CULane dataset. Code will be available at https://github.com/zkyseu/O2SFormer.

* We fix some errors in the first version 

Multi-Exposure HDR Composition by Gated Swin Transformer

Mar 15, 2023
Rui Zhou, Yan Niu


Fusing a sequence of perfectly aligned images captured at various exposures has shown great potential for approaching High Dynamic Range (HDR) imaging with sensors of limited dynamic range. However, in the presence of large motion of scene objects or the camera, misalignment is almost inevitable and leads to the notorious "ghost" artifacts. Besides, factors such as noise in dark regions or color saturation in over-bright regions may also prevent local image details from being filled into the HDR image. This paper provides a novel multi-exposure fusion model based on the Swin Transformer. In particular, we design feature selection gates, which are integrated with the feature extraction layers to detect outliers and block them from HDR image synthesis. To reconstruct the missing local details from well-aligned and properly exposed regions, we exploit the long-distance contextual dependency in the exposure-space pyramid through the self-attention mechanism. Extensive numerical and visual evaluation has been conducted on a variety of benchmark datasets. The experiments show that our model achieves accuracy on par with current top-performing multi-exposure HDR imaging models while gaining higher efficiency.
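A feature selection gate of the kind described can be pictured as a small gating head that predicts a per-location mask over the features of one exposure. The module below is a hedged sketch with assumed channel counts and activations, not the paper's exact design.

    import torch
    import torch.nn as nn

    class FeatureSelectionGate(nn.Module):
        # Predicts a per-location mask in [0, 1] that down-weights outlier features
        # (misaligned or badly exposed regions) before they enter HDR synthesis.
        def __init__(self, channels):
            super().__init__()
            self.gate = nn.Sequential(
                nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // 2, 1, kernel_size=1),
                nn.Sigmoid(),
            )

        def forward(self, feat):
            # feat: (B, C, H, W) features of one exposure; outlier locations are suppressed
            return feat * self.gate(feat)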

* 7 pages, 4 figures 

Multi-modal Machine Learning in Engineering Design: A Review and Future Directions

Feb 14, 2023
Binyang Song, Rui Zhou, Faez Ahmed


Multi-modal machine learning (MMML), which involves integrating multiple modalities of data and their corresponding processing methods, has demonstrated promising results in various practical applications, such as text-to-image translation. This review paper summarizes the recent progress and challenges in using MMML for engineering design tasks. First, we introduce the different data modalities commonly used as design representations and involved in MMML, including text, 2D pixel data (e.g., images and sketches), and 3D shape data (e.g., voxels, point clouds, and meshes). We then provide an overview of the various approaches and techniques used for representing, fusing, aligning, synthesizing, and co-learning multi-modal data as five fundamental concepts of MMML. Next, we review the state-of-the-art capabilities of MMML that potentially apply to engineering design tasks, including design knowledge retrieval, design evaluation, and design synthesis. We also highlight the potential benefits and limitations of using MMML in these contexts. Finally, we discuss the challenges and future directions in using MMML for engineering design, such as the need for large labeled multi-modal design datasets, robust and scalable algorithms, integrating domain knowledge, and handling data heterogeneity and noise. Overall, this review paper provides a comprehensive overview of the current state and prospects of MMML for engineering design applications.


CAB: Empathetic Dialogue Generation with Cognition, Affection and Behavior

Feb 03, 2023
Pan Gao, Donghong Han, Rui Zhou, Xuejiao Zhang, Zikun Wang


Empathy is an important characteristic to consider when building a more intelligent and humanized dialogue agent. However, existing methods do not fully comprehend empathy as a complex process involving three aspects: cognition, affection, and behavior. In this paper, we propose CAB, a novel framework that takes a comprehensive perspective of cognition, affection, and behavior to generate empathetic responses. For cognition, we build paths between critical keywords in the dialogue by leveraging external knowledge, because keywords are the core of the sentences in a dialogue. Building the logical relationships between keywords, which is overlooked by the majority of existing works, improves the understanding of keywords and contextual logic, thus enhancing cognitive ability. For affection, we capture the emotional dependencies with dual latent variables that contain both interlocutors' emotions, since considering both interlocutors' emotions simultaneously helps to learn the emotional dependencies. For behavior, we use appropriate dialogue acts to guide dialogue generation and enhance the expression of empathy. Extensive experiments demonstrate that our multi-perspective model outperforms state-of-the-art models in both automatic and manual evaluation.

* accepted as a short paper at DASFAA 2023 

Online Learning Based Mobile Robot Controller Adaptation for Slip Reduction

Jan 30, 2023
Huidong Gao, Rui Zhou, Masayoshi Tomizuka, Zhuo Xu


Slip is a very common phenomenon in wheeled mobile robotic systems. It has undesirable consequences such as wasting energy and impeding system stability. To tackle the challenge of mobile robot trajectory tracking under slippery conditions, we propose a hierarchical framework that learns and adapts the gains of the tracking controllers simultaneously online. Concretely, a reinforcement learning (RL) module is used to auto-tune the parameters of a lateral predictive controller and a longitudinal speed PID controller. Experiments show the necessity of simultaneous gain tuning and demonstrate that our online framework outperforms the best baseline controller using fixed gains. By utilizing online gain adaptation, our framework achieves robust tracking performance, rejecting slip and reducing tracking errors when the mobile robot travels through various terrains.
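The simultaneous online gain tuning can be illustrated with a minimal propose/evaluate/update loop. The paper uses an RL module; the stand-in below is a simple stochastic hill-climber over the joint gain vector, with the cost weighting w_slip as an assumption.

    import numpy as np

    class OnlineGainTuner:
        # Perturbs the lateral and longitudinal controller gains jointly and keeps the
        # perturbation when it lowers a cost combining tracking error and slip.
        def __init__(self, init_gains, sigma=0.05):
            self.gains = np.asarray(init_gains, dtype=float)   # e.g. [k_lateral, kp, ki, kd]
            self.candidate = self.gains.copy()
            self.sigma = sigma
            self.best_cost = np.inf

        def propose(self):
            # Sample a nearby joint gain set to run for the next episode.
            self.candidate = np.maximum(self.gains + self.sigma * np.random.randn(self.gains.size), 0.0)
            return self.candidate

        def update(self, tracking_error, slip_ratio, w_slip=0.5):
            # Observe the episode's cost and keep the candidate gains if they did better.
            cost = tracking_error + w_slip * slip_ratio
            if cost < self.best_cost:
                self.best_cost, self.gains = cost, self.candidate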


A Data Quality Assessment Framework for AI-enabled Wireless Communication

Dec 13, 2022
Hanning Tang, Liusha Yang, Rui Zhou, Jing Liang, Hong Wei, Xuan Wang, Qingjiang Shi, Zhi-Quan Luo


Using artificial intelligence (AI) to re-design and enhance the current wireless communication system is a promising pathway toward the future sixth-generation (6G) wireless network. The performance of AI-enabled wireless communication depends heavily on the quality of wireless air-interface data. Although there are various approaches to data quality assessment (DQA) for different applications, none has been designed for wireless air-interface data. In this paper, we propose a DQA framework to measure the quality of wireless air-interface data from three aspects: similarity, diversity, and completeness. Similarity measures how close the considered datasets are in terms of their statistical distributions; diversity measures how well-rounded a dataset is; and completeness measures to what degree the considered dataset satisfies the required performance metrics in an application scenario. The proposed framework can be applied to various types of wireless air-interface data, such as channel state information (CSI), signal-to-interference-plus-noise ratio (SINR), and reference signal received power (RSRP). For simplicity, the validity of the proposed DQA framework is corroborated by applying it to CSI data and using the similarity and diversity metrics to improve CSI compression and recovery in Massive MIMO systems.
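As an example of a distribution-level similarity score between two air-interface datasets, the sketch below uses a Gaussian-kernel maximum mean discrepancy (MMD). This is one reasonable way to make the "how close are the statistical distributions" idea concrete, not the paper's metric.

    import numpy as np

    def mmd_similarity(X, Y, sigma=1.0):
        # X, Y: (n, d) and (m, d) arrays of samples (e.g. flattened CSI vectors) from two datasets.
        # Returns a score in (0, 1]; higher means the two empirical distributions are closer.
        def kernel(A, B):
            d2 = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
            return np.exp(-d2 / (2.0 * sigma ** 2))
        mmd2 = kernel(X, X).mean() + kernel(Y, Y).mean() - 2.0 * kernel(X, Y).mean()
        return 1.0 / (1.0 + max(mmd2, 0.0))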
