Yuhua Chen

A Phase-Coded Time-Domain Interleaved OTFS Waveform with Improved Ambiguity Function

Jul 26, 2023
Jiajun Zhu, Yanqun Tang, Chao Yang, Chi Zhang, Haoran Yin, Jiaojiao Xiong, Yuhua Chen

Integrated sensing and communication (ISAC) is a significant application scenario in future wireless communication networks, and sensing performance is commonly evaluated via the ambiguity function. To enhance the sensing performance of the orthogonal time frequency space (OTFS) waveform, we propose a novel time-domain interleaved cyclic-shifted P4-coded OTFS (TICP4-OTFS) waveform with improved ambiguity function. TICP4-OTFS achieves superior autocorrelation features in both the time and frequency domains by exploiting the multicarrier-like form of OTFS after interleaving and the favorable autocorrelation properties of the P4 code. Furthermore, we present the vectorized formulation of TICP4-OTFS modulation as well as its signal structure in each domain. Numerical simulations show that the proposed TICP4-OTFS waveform outperforms OTFS in both the delay- and Doppler-dimensional ambiguity functions, with a narrower mainlobe and lower, more widely separated sidelobes; an example of range estimation using pulse compression illustrates the proposed waveform's finer resolution. In addition, TICP4-OTFS achieves a lower bit error rate for communication in low signal-to-noise ratio (SNR) scenarios.
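As a rough illustration of the P4 autocorrelation property the abstract leans on (this is not the paper's TICP4-OTFS construction), the following Python sketch computes the discrete periodic ambiguity function of a plain P4 sequence. The code length N and the delay/Doppler grids are arbitrary assumptions.

```python
# Minimal sketch: periodic ambiguity function |<s, s_{tau,nu}>| of a P4 code.
import numpy as np

N = 64                                   # code length (assumption)
n = np.arange(N)
phase = np.pi * n * (n - N) / N          # standard P4 phase law
s = np.exp(1j * phase)                   # unit-amplitude P4 sequence

delays = np.arange(-N + 1, N)            # discrete delay axis (chips)
dopplers = np.linspace(-0.5, 0.5, 101)   # normalized Doppler axis (cycles/chip)
af = np.zeros((len(dopplers), len(delays)))

for i, nu in enumerate(dopplers):
    s_dop = s * np.exp(2j * np.pi * nu * n)           # Doppler-shifted copy
    for j, tau in enumerate(delays):
        af[i, j] = np.abs(np.vdot(s, np.roll(s_dop, tau)))  # cyclic delay -> periodic AF

af /= af.max()                                        # normalize mainlobe to 1
zero_doppler_cut = af[len(dopplers) // 2]             # delay-dimension cut
print("peak sidelobe (zero-Doppler cut):", np.sort(zero_doppler_cut)[-2])
```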


Data-Consistent Non-Cartesian Deep Subspace Learning for Efficient Dynamic MR Image Reconstruction

May 03, 2022
Zihao Chen, Yuhua Chen, Yibin Xie, Debiao Li, Anthony G. Christodoulou

Non-Cartesian sampling with subspace-constrained image reconstruction is a popular approach to dynamic MRI, but slow iterative reconstruction limits its clinical application. Data-consistent (DC) deep learning can accelerate reconstruction with good image quality, but has not been formulated for non-Cartesian subspace imaging. In this study, we propose a DC non-Cartesian deep subspace learning framework for fast, accurate dynamic MR image reconstruction. Four novel DC formulations are developed and evaluated: two gradient descent approaches, a directly solved approach, and a conjugate gradient approach. We applied a U-Net model with and without DC layers to reconstruct T1-weighted images for cardiac MR Multitasking (an advanced multidimensional imaging method), comparing our results to the iteratively reconstructed reference. Experimental results show that the proposed framework significantly improves reconstruction accuracy over the U-Net model without DC, while significantly accelerating the reconstruction over conventional iterative reconstruction.
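To illustrate the general shape of a gradient-descent style data-consistency update (a simplified stand-in, not the paper's non-Cartesian subspace formulation), the sketch below applies x <- x - alpha * A^H(A x - y) with a toy masked-FFT forward operator; the paper's setting would instead use a NUFFT-based operator and a temporal subspace.

```python
# Hedged sketch of a gradient-descent data-consistency (DC) update.
import numpy as np

def make_operators(mask):
    """Return forward/adjoint operators for a toy masked 2D FFT (assumption)."""
    def A(x):
        return mask * np.fft.fft2(x, norm="ortho")
    def AH(y):
        return np.fft.ifft2(mask * y, norm="ortho")
    return A, AH

def dc_gradient_step(x, y, A, AH, alpha=1.0):
    """One DC update: pull the current estimate toward the measured data."""
    return x - alpha * AH(A(x) - y)

rng = np.random.default_rng(0)
gt = rng.standard_normal((64, 64))            # toy "image"
mask = rng.random((64, 64)) < 0.3             # 30% sampled k-space
A, AH = make_operators(mask)
y = A(gt)                                     # simulated measurements

x = AH(y)                                     # zero-filled initial estimate
for _ in range(10):
    x = dc_gradient_step(x, y, A, AH, alpha=1.0)
print("data fidelity:", np.linalg.norm(A(x) - y))
```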

* Accepted by IEEE ISBI 2022 

Improving Semi-Supervised and Domain-Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

Aug 28, 2021
Lukas Hoyer, Dengxin Dai, Qin Wang, Yuhua Chen, Luc Van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process. To address this issue, we present a framework for semi-supervised and domain-adaptive semantic segmentation, which is enhanced by self-supervised monocular depth estimation (SDE) trained only on unlabeled image sequences. In particular, we utilize SDE as an auxiliary task comprehensively across the entire learning framework: First, we automatically select the most useful samples to be annotated for semantic segmentation based on the correlation of sample diversity and difficulty between SDE and semantic segmentation. Second, we implement a strong data augmentation by mixing images and labels using the geometry of the scene. Third, we transfer knowledge from features learned during SDE to semantic segmentation by means of transfer and multi-task learning. Fourth, we exploit additional labeled synthetic data with Cross-Domain DepthMix and Matching Geometry Sampling to align synthetic and real data. We validate the proposed model on the Cityscapes dataset, where all four contributions demonstrate significant performance gains, and achieve state-of-the-art results for semi-supervised semantic segmentation as well as for semi-supervised domain adaptation. In particular, with only 1/30 of the Cityscapes labels, our method achieves 92% of the fully-supervised baseline performance and even 97% when exploiting additional data from GTA. The source code is available at https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth.
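A rough sketch of the geometry-aware mixing idea from the second contribution (a simplification in the spirit of DepthMix, not the official implementation): pixels from one image are pasted into another only where their estimated depth is smaller, so occlusion ordering is respected, and labels are mixed with the same mask.

```python
# Hedged sketch of depth-aware image/label mixing.
import numpy as np

def depth_aware_mix(img_a, lbl_a, depth_a, img_b, lbl_b, depth_b):
    """Mix (img_b, lbl_b) into (img_a, lbl_a) where b is closer to the camera."""
    mask = depth_b < depth_a                         # foreground-of-b mask, HxW
    mixed_img = np.where(mask[..., None], img_b, img_a)
    mixed_lbl = np.where(mask, lbl_b, lbl_a)
    return mixed_img, mixed_lbl

# Toy usage with random arrays (real inputs are images, labels, and SDE depth maps).
H, W = 4, 5
rng = np.random.default_rng(0)
img_a, img_b = rng.random((H, W, 3)), rng.random((H, W, 3))
lbl_a, lbl_b = rng.integers(0, 19, (H, W)), rng.integers(0, 19, (H, W))
depth_a, depth_b = rng.random((H, W)), rng.random((H, W))
mixed_img, mixed_lbl = depth_aware_mix(img_a, lbl_a, depth_a, img_b, lbl_b, depth_b)
```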

* arXiv admin note: text overlap with arXiv:2012.10782 

Learning to Relate Depth and Semantics for Unsupervised Domain Adaptation

May 17, 2021
Suman Saha, Anton Obukhov, Danda Pani Paudel, Menelaos Kanakis, Yuhua Chen, Stamatios Georgoulis, Luc Van Gool

We present an approach for encoding visual task relationships to improve model performance in an Unsupervised Domain Adaptation (UDA) setting. Semantic segmentation and monocular depth estimation are shown to be complementary tasks; in a multi-task learning setting, a proper encoding of their relationships can further improve performance on both tasks. Motivated by this observation, we propose a novel Cross-Task Relation Layer (CTRL), which encodes task dependencies between the semantic and depth predictions. To capture the cross-task relationships, we propose a neural network architecture that contains task-specific and cross-task refinement heads. Furthermore, we propose an Iterative Self-Learning (ISL) training scheme, which exploits semantic pseudo-labels to provide extra supervision on the target domain. We experimentally observe improvements in both tasks' performance because the complementary information present in these tasks is better captured. Specifically, we show that: (1) our approach improves performance on all tasks when they are complementary and mutually dependent; (2) CTRL helps improve performance on both semantic segmentation and depth estimation in the challenging UDA setting; (3) the proposed ISL training scheme further improves the semantic segmentation performance. The implementation is available at https://github.com/susaha/ctrl-uda.
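A minimal sketch of confidence-thresholded pseudo-labeling in the spirit of the ISL scheme described above (a hypothetical simplification, not the repository's code): keep argmax predictions only where the softmax confidence is high and ignore the rest during training on the target domain.

```python
# Hedged sketch of pseudo-label generation for self-learning.
import torch
import torch.nn.functional as F

def make_pseudo_labels(logits, threshold=0.9, ignore_index=255):
    """Keep argmax predictions only where the softmax confidence is high."""
    probs = F.softmax(logits, dim=1)                 # (B, C, H, W)
    conf, labels = probs.max(dim=1)                  # (B, H, W)
    labels[conf < threshold] = ignore_index          # mask out low-confidence pixels
    return labels

# Toy usage: the pseudo-labels provide extra supervision on the target domain.
logits = torch.randn(2, 19, 8, 8)                    # e.g. 19 Cityscapes classes
pseudo = make_pseudo_labels(logits, threshold=0.2)   # low threshold only so random logits pass
loss = F.cross_entropy(logits, pseudo, ignore_index=255)
```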

* Accepted at CVPR 2021 

Three Ways to Improve Semantic Segmentation with Self-Supervised Depth Estimation

Dec 19, 2020
Lukas Hoyer, Dengxin Dai, Yuhua Chen, Adrian Köring, Suman Saha, Luc Van Gool

Training deep networks for semantic segmentation requires large amounts of labeled training data, which presents a major challenge in practice, as labeling segmentation masks is a highly labor-intensive process. To address this issue, we present a framework for semi-supervised semantic segmentation, which is enhanced by self-supervised monocular depth estimation from unlabeled images. In particular, we propose three key contributions: (1) We transfer knowledge from features learned during self-supervised depth estimation to semantic segmentation, (2) we implement a strong data augmentation by blending images and labels using the structure of the scene, and (3) we utilize the depth feature diversity as well as the level of difficulty of learning depth in a student-teacher framework to select the most useful samples to be annotated for semantic segmentation. We validate the proposed model on the Cityscapes dataset, where all three modules demonstrate significant performance gains, and we achieve state-of-the-art results for semi-supervised semantic segmentation. The implementation is available at https://github.com/lhoyer/improving_segmentation_with_selfsupervised_depth.
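A hedged sketch of diversity-driven sample selection loosely in the spirit of the third contribution (a simplified stand-in for the paper's student-teacher criterion): greedily pick images whose depth features are far from the already-selected set, weighted by a per-image "difficulty" score; the exact scoring rule here is an assumption.

```python
# Hedged sketch of diversity x difficulty sample selection for annotation.
import numpy as np

def select_samples(features, difficulty, budget):
    """features: (N, D) per-image depth features; difficulty: (N,) scores."""
    selected = [int(np.argmax(difficulty))]          # start from the hardest image
    for _ in range(budget - 1):
        dists = np.min(
            np.linalg.norm(features[:, None] - features[selected][None], axis=-1),
            axis=1)                                   # distance to the selected set
        score = dists * difficulty                    # combine diversity and difficulty (assumption)
        score[selected] = -np.inf                     # never pick an image twice
        selected.append(int(np.argmax(score)))
    return selected

rng = np.random.default_rng(0)
feats = rng.standard_normal((100, 16))               # toy depth features
diff = rng.random(100)                                # toy difficulty scores
print(select_samples(feats, diff, budget=5))
```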


mDALU: Multi-Source Domain Adaptation and Label Unification with Partial Datasets

Dec 15, 2020
Rui Gong, Dengxin Dai, Yuhua Chen, Wen Li, Luc Van Gool

Object recognition has advanced rapidly in recent years. One challenge is to generalize existing methods to new domains, to more classes, and/or to new data modalities. To avoid annotating one dataset for each of these new cases, one needs to combine and reuse existing datasets that may belong to different domains, have partial annotations, and/or have different data modalities. This paper treats this task as a multi-source domain adaptation and label unification (mDALU) problem and proposes a novel method for it. Our method consists of a partially-supervised adaptation stage and a fully-supervised adaptation stage. In the former, partial knowledge is transferred from multiple source domains to the target domain and fused therein. Negative transfer between unmatched label spaces is mitigated via three new modules: domain attention, uncertainty maximization, and attention-guided adversarial alignment. In the latter, knowledge is transferred in the unified label space after a label completion process with pseudo-labels. We verify the method on three different tasks: image classification, 2D semantic image segmentation, and joint 2D-3D semantic segmentation. Extensive experiments show that our method significantly outperforms all competing methods.
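As a simplified illustration of the uncertainty-maximization idea (not the paper's exact loss), the sketch below encourages high entropy over the classes that a given source domain cannot annotate, for samples coming from that domain; the class split and the loss form are assumptions.

```python
# Hedged sketch: entropy maximization over classes outside a source's label set.
import torch
import torch.nn.functional as F

def uncertainty_maximization_loss(logits, known_classes):
    """Maximize entropy over classes not in `known_classes` (list of class ids)."""
    probs = F.softmax(logits, dim=1)                          # (B, C, H, W)
    unknown = [c for c in range(logits.shape[1]) if c not in known_classes]
    p_unknown = probs[:, unknown]                             # probs of unlabeled classes
    entropy = -(p_unknown * torch.log(p_unknown + 1e-8)).sum(dim=1)
    return -entropy.mean()                                    # minimizing this maximizes entropy

logits = torch.randn(2, 10, 8, 8)                             # toy unified 10-class space
loss = uncertainty_maximization_loss(logits, known_classes=[0, 1, 2, 3])
```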

* 17 pages, 10 figures, 13 tables 

Cluster, Split, Fuse, and Update: Meta-Learning for Open Compound Domain Adaptive Semantic Segmentation

Dec 15, 2020
Rui Gong, Yuhua Chen, Danda Pani Paudel, Yawei Li, Ajad Chhatkuli, Wen Li, Dengxin Dai, Luc Van Gool

Open compound domain adaptation (OCDA) is a domain adaptation setting in which the target domain is modeled as a compound of multiple unknown homogeneous domains, bringing the advantage of improved generalization to unseen domains. In this work, we propose a principled meta-learning based approach to OCDA for semantic segmentation, MOCDA, by modeling the unlabeled target domain continuously. Our approach consists of four key steps. First, we cluster the target domain into multiple sub-target domains by image styles, extracted in an unsupervised manner. Then, different sub-target domains are split into independent branches, for which separate batch normalization parameters are learned to treat them independently. A meta-learner is thereafter deployed to learn to fuse the sub-target-domain-specific predictions, conditioned on the style code. Meanwhile, we learn to update the model online with the model-agnostic meta-learning (MAML) algorithm to further improve generalization. We validate the benefits of our approach through extensive experiments on synthetic-to-real knowledge transfer benchmark datasets, where we achieve state-of-the-art performance in both compound and open domains.
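A hedged sketch of the first step (a simplification): describe each target image by a crude style code and cluster the codes with k-means to obtain sub-target domains. The real method extracts styles from network features; the per-channel mean/std code below is an assumption made only for illustration.

```python
# Hedged sketch: unsupervised clustering of target images by a simple style code.
import numpy as np

def style_code(img):
    """img: (H, W, 3) float array -> 6-dim style code (assumption)."""
    return np.concatenate([img.mean(axis=(0, 1)), img.std(axis=(0, 1))])

def kmeans(codes, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    centers = codes[rng.choice(len(codes), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(np.linalg.norm(codes[:, None] - centers[None], axis=-1), axis=1)
        centers = np.stack([codes[assign == j].mean(axis=0) if np.any(assign == j)
                            else centers[j] for j in range(k)])  # keep empty clusters fixed
    return assign

rng = np.random.default_rng(0)
images = rng.random((50, 32, 32, 3))             # toy target-domain images
codes = np.stack([style_code(im) for im in images])
sub_domain = kmeans(codes, k=4)                  # sub-target domain index per image
```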

* 18 pages, 8 figures, 8 tables 

Analogical Image Translation for Fog Generation

Jun 28, 2020
Rui Gong, Dengxin Dai, Yuhua Chen, Wen Li, Luc Van Gool

Image-to-image translation maps images from one given \emph{style} to another. While exceptionally successful, current methods assume the availability of training images in both source and target domains, which does not always hold in practice. Inspired by humans' reasoning capability of analogy, we propose analogical image translation (AIT). Given images of two styles in the source domain, $\mathcal{A}$ and $\mathcal{A}^\prime$, along with images $\mathcal{B}$ of the first style in the target domain, AIT learns a model to translate $\mathcal{B}$ to $\mathcal{B}^\prime$ in the target domain, such that $\mathcal{A}:\mathcal{A}^\prime ::\mathcal{B}:\mathcal{B}^\prime$. AIT is especially useful for translation scenarios in which training data of one style is hard to obtain but training data of the same two styles in another domain is available. For instance, when going from normal conditions to extreme, rare conditions, obtaining real training images for the latter is challenging, but obtaining synthetic data for both is relatively easy. In this work, we are interested in adding adverse weather effects, more specifically fog, to images taken in clear weather. To circumvent the challenge of collecting real foggy images, AIT learns from synthetic clear-weather images, synthetic foggy images, and real clear-weather images to add fog effects to real clear-weather images without seeing any real foggy images during training. AIT achieves this zero-shot image translation capability by coupling a supervised training scheme in the synthetic domain, a cycle consistency strategy in the real domain, an adversarial training scheme between the two domains, and a novel network design. Experiments show the effectiveness of our method for zero-shot image translation and its benefit for downstream tasks such as semantic foggy scene understanding.
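A highly simplified sketch of how two of the training signals above could be combined (toy one-layer generators, not the paper's architecture; the adversarial term is omitted for brevity): a supervised translation loss on the paired synthetic images and a cycle-consistency loss on real clear-weather images.

```python
# Hedged sketch of combining a supervised synthetic-domain loss with a real-domain cycle loss.
import torch
import torch.nn as nn

G = nn.Conv2d(3, 3, 3, padding=1)        # clear -> foggy generator (toy stand-in)
F_inv = nn.Conv2d(3, 3, 3, padding=1)    # foggy -> clear generator (toy stand-in)
l1 = nn.L1Loss()

syn_clear = torch.rand(2, 3, 64, 64)     # synthetic clear-weather images (A)
syn_foggy = torch.rand(2, 3, 64, 64)     # paired synthetic foggy images (A')
real_clear = torch.rand(2, 3, 64, 64)    # real clear-weather images (B)

sup_loss = l1(G(syn_clear), syn_foggy)              # supervised loss in the synthetic domain
cycle_loss = l1(F_inv(G(real_clear)), real_clear)   # cycle consistency in the real domain
total = sup_loss + cycle_loss                       # + adversarial terms in the full method
total.backward()
```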

* 18 pages, 9 figures, 7 tables 

Consistency Guided Scene Flow Estimation

Jun 19, 2020
Yuhua Chen, Luc Van Gool, Cordelia Schmid, Cristian Sminchisescu

We present Consistency Guided Scene Flow Estimation (CGSF), a framework for joint estimation of 3D scene structure and motion from stereo videos. The model takes two temporal stereo pairs as input, and predicts disparity and scene flow. The model self-adapts at test time by iteratively refining its predictions. The refinement process is guided by a consistency loss, which combines stereo and temporal photo-consistency with a geometric term that couples the disparity and 3D motion. To handle the noise in the consistency loss, we further propose a learned output refinement network, which takes the initial predictions, the loss, and the gradient as input, and efficiently predicts a correlated output update. We demonstrate with extensive experiments that the proposed model can reliably predict disparity and scene flow in many challenging scenarios, and achieves better generalization than the state of the art.
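A minimal sketch of the test-time refinement idea (not the learned refinement network from the paper): treat the initial prediction as a free variable and take a few gradient steps on a differentiable consistency loss. The toy loss below is a placeholder standing in for the stereo/temporal photo-consistency and geometric terms.

```python
# Hedged sketch of iterative test-time self-adaptation of a prediction.
import torch

def toy_consistency_loss(pred, left, right):
    """Placeholder loss (assumption): penalize disagreement with a crude image cue."""
    crude_cue = (left - right).abs().mean(dim=1, keepdim=True)
    return ((pred - crude_cue) ** 2).mean()

left = torch.rand(1, 3, 32, 32)                         # toy stereo pair
right = torch.rand(1, 3, 32, 32)
pred = torch.zeros(1, 1, 32, 32, requires_grad=True)    # initial disparity-like estimate

opt = torch.optim.SGD([pred], lr=0.5)
for _ in range(20):                                     # iterative refinement at test time
    opt.zero_grad()
    loss = toy_consistency_loss(pred, left, right)
    loss.backward()
    opt.step()
print(float(loss))
```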


MRI Super-Resolution with GAN and 3D Multi-Level DenseNet: Smaller, Faster, and Better

Mar 06, 2020
Yuhua Chen, Anthony G. Christodoulou, Zhengwei Zhou, Feng Shi, Yibin Xie, Debiao Li

High-resolution (HR) magnetic resonance imaging (MRI) provides detailed anatomical information that is critical for clinical diagnosis. However, HR MRI typically comes at the cost of long scan time, small spatial coverage, and low signal-to-noise ratio (SNR). Recent studies showed that with a deep convolutional neural network (CNN), generic HR images could be recovered from low-resolution (LR) inputs via single image super-resolution (SISR) approaches. Additionally, previous works have shown that a deep 3D CNN can generate high-quality SR MRIs by using learned image priors. However, deep 3D CNNs have a large number of parameters and are computationally expensive. In this paper, we propose a novel 3D CNN architecture, namely a multi-level densely connected super-resolution network (mDCSRN), which is lightweight, fast, and accurate. We also show that with generative adversarial network (GAN)-guided training, the mDCSRN-GAN provides appealingly sharp SR images with rich texture details that are highly comparable to the reference HR images. Our results from experiments on a large public dataset with 1,113 subjects showed that this new architecture outperformed other popular deep learning methods in recovering 4x resolution-downgraded images in both quality and speed.
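A hedged sketch of a 3D densely connected block in the spirit of the architecture described above (layer count, growth rate, and activation are assumptions, not the paper's mDCSRN configuration): each layer receives the concatenation of all previous feature maps.

```python
# Hedged sketch of a 3D dense block for volumetric super-resolution features.
import torch
import torch.nn as nn

class DenseBlock3D(nn.Module):
    def __init__(self, in_ch, growth=16, layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        ch = in_ch
        for _ in range(layers):
            self.layers.append(nn.Sequential(
                nn.Conv3d(ch, growth, kernel_size=3, padding=1),
                nn.ELU(inplace=True)))
            ch += growth                            # each layer sees all previous outputs

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            out = layer(torch.cat(feats, dim=1))    # dense connectivity
            feats.append(out)
        return torch.cat(feats, dim=1)

block = DenseBlock3D(in_ch=8)
lr_patch = torch.randn(1, 8, 16, 16, 16)            # toy 3D LR feature patch
print(block(lr_patch).shape)                         # (1, 8 + 4*16, 16, 16, 16)
```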

* Preprint submitted to Medical Image Analysis 