Yiru Wang

Keyword-Based Diverse Image Retrieval by Semantics-aware Contrastive Learning and Transformer

May 06, 2023

Minyi Zhao, Jinpeng Wang, Dongliang Liao, Yiru Wang, Huanzhong Duan, Shuigeng Zhou

Abstract: In addition to relevance, diversity is an important yet less studied performance metric of cross-modal image retrieval systems, which is critical to user experience. Existing solutions for diversity-aware image retrieval either explicitly post-process the raw retrieval results from standard retrieval systems or try to learn multi-vector representations of images to represent their diverse semantics. However, neither of them is good enough to balance relevance and diversity. On the one hand, standard retrieval systems are usually biased toward common semantics and seldom exploit diversity-aware regularization in training, which makes it difficult to promote diversity by post-processing. On the other hand, multi-vector representation methods are not guaranteed to learn robust multiple projections. As a result, irrelevant images and images of rare or unique semantics may be projected inappropriately, which degrades the relevance and diversity of the results generated by typical algorithms such as top-k. To cope with these problems, this paper presents a new method called CoLT that tries to generate much more representative and robust representations for accurately classifying images. Specifically, CoLT first extracts semantics-aware image features by enhancing the preliminary representations of an existing one-to-one cross-modal system with semantics-aware contrastive learning. Then, a transformer-based token classifier is developed to subsume all the features into their corresponding categories. Finally, a post-processing algorithm is designed to retrieve images from each category to form the final retrieval result. Extensive experiments on two real-world datasets, Div400 and Div150Cred, show that CoLT can effectively boost diversity and outperforms the existing methods as a whole (with a higher F1 score).

* Accepted by SIGIR2023 (long paper)
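
The final step described in the abstract, retrieving images from each category to form the result list, can be illustrated with a simple round-robin selection. This is only a minimal sketch under stated assumptions, not the CoLT post-processing algorithm itself: the function name `diverse_top_k`, the dictionary inputs, and the round-robin ordering are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def diverse_top_k(scores, categories, k):
    """Pick up to k image ids by cycling over categories (hypothetical sketch).

    scores:     dict image_id -> relevance score (higher is better)
    categories: dict image_id -> predicted category label
    """
    # Group images by their predicted category.
    buckets = defaultdict(list)
    for image_id, score in scores.items():
        buckets[categories[image_id]].append((score, image_id))

    # Within each category, put the most relevant image first.
    for items in buckets.values():
        items.sort(reverse=True)

    # Visit categories in order of their best image's score.
    order = sorted(buckets, key=lambda c: buckets[c][0], reverse=True)

    result = []
    rank = 0  # rank-th best image of every category is taken in pass `rank`
    while len(result) < k:
        added = False
        for c in order:
            if rank < len(buckets[c]):
                result.append(buckets[c][rank][1])
                added = True
                if len(result) == k:
                    break
        if not added:  # every category is exhausted
            break
        rank += 1
    return result
```

With this ordering, the top of the list covers as many categories as possible before any category contributes a second image, which is one simple way to trade a little relevance for diversity.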

Adaptive Rotated Convolution for Rotated Object Detection

Mar 14, 2023

Yifan Pu, Yiru Wang, Zhuofan Xia, Yizeng Han, Yulin Wang, Weihao Gan, Zidong Wang, Shiji Song, Gao Huang

Abstract: Rotated object detection aims to identify and locate objects in images with arbitrary orientation. In this scenario, the orientations of objects vary considerably across different images, while multiple orientations of objects exist within an image. This intrinsic characteristic makes it challenging for standard backbone networks to extract high-quality features of these arbitrarily oriented objects. In this paper, we present the Adaptive Rotated Convolution (ARC) module to handle the aforementioned challenges. In our ARC module, the convolution kernels rotate adaptively to extract object features with varying orientations in different images, and an efficient conditional computation mechanism is introduced to accommodate the large orientation variations of objects within an image. The two designs work together seamlessly for rotated object detection. Moreover, ARC can conveniently serve as a plug-and-play module in various vision backbones to boost their representation ability to detect oriented objects accurately. Experiments on commonly used benchmarks (DOTA and HRSC2016) demonstrate that, with our proposed ARC module in the backbone network, the performance of multiple popular oriented object detectors is significantly improved (e.g., +3.03% mAP on Rotated RetinaNet and +4.16% on CFA). Combined with the highly competitive method Oriented R-CNN, the proposed approach achieves state-of-the-art performance on the DOTA dataset with 81.77% mAP.

Weighted Sum Secrecy Rate Maximization for RIS-Assisted Full Duplex systems

Jun 17, 2022

Pengxin Guan, Yiru Wang, Yuping Zhao

Abstract: This letter considers secure communication in a reconfigurable intelligent surface (RIS) aided full-duplex (FD) system. An FD base station (BS) serves an uplink (UL) user and a downlink (DL) user simultaneously over the same time-frequency dimension, assisted by an RIS in the presence of an eavesdropper. In addition, the BS transmits artificial noise (AN) to interfere with the eavesdropper's channel. We aim to maximize the weighted sum secrecy rate of the UL and DL users by jointly optimizing the transmit beamforming, receive beamforming and AN covariance matrix at the BS, and the passive beamforming at the RIS. To handle the non-convex problem, we decompose it into tractable subproblems and propose an efficient algorithm based on the alternating optimization framework. Specifically, the receive beamforming is derived as a closed-form solution, while the other variables are obtained using the semidefinite relaxation (SDR) method and the successive convex approximation (SCA) algorithm. Simulation results demonstrate the superior performance of our proposed scheme compared to other baseline schemes.

* This is our preliminary research paper. More detailed papers will be updated as soon as possible

Denial-of-Service Attacks on Learned Image Compression

May 26, 2022

Kang Liu, Di Wu, Yiru Wang, Dan Feng, Benjamin Tan, Siddharth Garg

Abstract: Deep learning techniques have shown promising results in image compression, with competitive bitrate and image reconstruction quality from the compressed latent. However, while image compression has progressed towards higher peak signal-to-noise ratio (PSNR) and fewer bits per pixel (bpp), its robustness to corner-case images has never been examined. In this work, we, for the first time, investigate the robustness of image compression systems, where an imperceptible perturbation of input images can precipitate a significant increase in the bitrate of their compressed latent. To characterize the robustness of state-of-the-art learned image compression, we mount white-box and black-box attacks. Our results on several image compression models with various bitrate qualities show that they are surprisingly fragile: the white-box attack achieves up to a 56.326x and the black-box attack a 1.947x bpp change. To improve robustness, we propose a novel model which incorporates attention modules and a basic factorized entropy model, resulting in a promising trade-off between the PSNR/bpp ratio and robustness to adversarial attacks that surpasses existing learned image compressors.

Reconfigurable Intelligent Surfaces for Energy Efficiency in Full-duplex Communication System

May 24, 2022

Yiru Wang, Pengxin Guan, Hongkang Yu, Yuping Zhao

Abstract: In this letter, we study a reconfigurable intelligent surface (RIS) aided full-duplex (FD) communication system. By jointly designing the active beamforming of two multi-antenna sources and the passive beamforming of the RIS, we aim to maximize the energy efficiency of the system, where the extra self-interference cancellation power consumption in the FD system is also considered. We divide the optimization problem into active and passive beamforming design subproblems, and adopt the alternating optimization framework to solve them iteratively. Dinkelbach's method is used to tackle the fractional objective function in the active beamforming problem. The penalty method and successive convex approximation are exploited for the passive beamforming design. Simulation results show that the energy efficiency of our scheme outperforms other benchmarks.
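
Dinkelbach's method, mentioned in the abstract for the fractional objective, replaces a ratio maximization with a sequence of subtractive problems. The sketch below applies it to a toy single-link energy-efficiency problem, maximizing log2(1 + gain·p) / (p + p_circuit) over the transmit power p. The scalar model, the function name `dinkelbach_power`, and all parameter names are assumptions made for illustration; the letter's actual beamforming problem is far richer.

```python
import math


def dinkelbach_power(gain, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Toy Dinkelbach iteration (illustrative assumption, not the paper's model).

    Maximizes log2(1 + gain * p) / (p + p_circuit) over 0 <= p <= p_max,
    assuming gain > 0 and p_circuit > 0.
    Returns (p, lam), where lam is the achieved energy efficiency.
    """
    lam = 0.0
    p = p_max
    for _ in range(max_iter):
        # Inner problem: maximize log2(1 + gain*p) - lam * (p + p_circuit).
        # It is concave in p, so the clipped stationary point is optimal.
        if lam > 0.0:
            p = 1.0 / (lam * math.log(2)) - 1.0 / gain
            p = min(max(p, 0.0), p_max)
        else:
            p = p_max  # with lam = 0 the rate alone is maximized
        rate = math.log2(1.0 + gain * p)
        residual = rate - lam * (p + p_circuit)
        # Dinkelbach update: lam becomes the ratio achieved by the current p.
        lam = rate / (p + p_circuit)
        if abs(residual) < tol:
            break
    return p, lam
```

The iteration stops when the inner optimum is zero, which is the condition under which the current ratio is the maximum of the fractional objective.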

Cross Domain Object Detection by Target-Perceived Dual Branch Distillation

May 03, 2022

Mengzhe He, Yali Wang, Jiaxi Wu, Yiru Wang, Hanqing Li, Bo Li, Weihao Gan, Wei Wu, Yu Qiao

Abstract: Cross domain object detection is a realistic and challenging task in the wild. It suffers from performance degradation due to a large shift in data distributions and a lack of instance-level annotations in the target domain. Existing approaches mainly focus on only one of these two difficulties, even though they are closely coupled in cross domain object detection. To solve this problem, we propose a novel Target-perceived Dual-branch Distillation (TDD) framework. By integrating detection branches of both source and target domains in a unified teacher-student learning scheme, it can reduce domain shift and generate reliable supervision effectively. In particular, we first introduce a distinct Target Proposal Perceiver between the two domains. It can adaptively enhance the source detector to perceive objects in a target image, by leveraging target proposal contexts from iterative cross-attention. Afterwards, we design a concise Dual Branch Self Distillation strategy for model training, which can progressively integrate complementary object knowledge from different domains via self-distillation in two branches. Finally, we conduct extensive experiments on a number of widely used scenarios in cross domain object detection. The results show that our TDD significantly outperforms the state-of-the-art methods on all the benchmarks. Our code and model will be available at https://github.com/Feobi1999/TDD.

* CVPR2022

Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection

Apr 17, 2022

Jiaxi Wu, Jiaxin Chen, Mengzhe He, Yiru Wang, Bo Li, Bingqi Ma, Weihao Gan, Wei Wu, Yali Wang, Di Huang

Abstract: Domain adaptive object detection (DAOD) is a promising way to alleviate the performance drop of detectors in new scenes. Despite the great effort devoted to single-source domain adaptation, the more general task with multiple source domains remains underexplored, due to knowledge degradation when the domains are combined. To address this issue, we propose a novel approach, namely target-relevant knowledge preservation (TRKP), to unsupervised multi-source DAOD. Specifically, TRKP adopts the teacher-student framework, where the multi-head teacher network is built to extract knowledge from labeled source domains and guide the student network to learn detectors in the unlabeled target domain. The teacher network is further equipped with an adversarial multi-source disentanglement (AMSD) module to preserve source domain-specific knowledge and simultaneously perform cross-domain alignment. In addition, a holistic target-relevant mining (HTRM) scheme is developed to re-weight the source images according to the source-target relevance. In this way, the teacher network is forced to capture target-relevant knowledge, which helps reduce domain shift when mentoring object detection in the target domain. Extensive experiments are conducted on various widely used benchmarks with new state-of-the-art scores reported, highlighting the effectiveness of the approach.

* CVPR2022

Energy Efficiency Maximization of Simultaneous Transmission and Reflection RIS Assisted Full-Duplex Communications

Mar 14, 2022

Pengxin Guan, Yiru Wang, Hongkang Yu, Yuping Zhao

Abstract: This work studies the effectiveness of a novel simultaneous transmission and reflection reconfigurable intelligent surface (STAR-RIS) aided full-duplex (FD) communication system. We aim to maximize the energy efficiency by jointly optimizing the transmit power and the passive beamforming at the STAR-RIS. We propose an efficient algorithm to optimize them iteratively under the alternating optimization framework. The successive convex approximation (SCA) and Dinkelbach's method are used to solve the power optimization subproblem. The penalty-based method is used to design the passive beamforming at the STAR-RIS. Numerical results verify the convergence and effectiveness of the proposed algorithm, and further reveal the benefits of combining the STAR-RIS with FD communication compared to benchmarks.

Simultaneous Transmission and Reflection Reconfigurable Intelligent Surface Assisted Full-Duplex Communications

Mar 12, 2022

Yiru Wang, Pengxin Guan, Hongkang Yu, Yuping Zhao

Abstract: This work demonstrates the effectiveness of a novel simultaneous transmission and reflection reconfigurable intelligent surface (STAR-RIS) in a full-duplex (FD) communication system. The objective is to minimize the total transmit power by jointly designing the transmit power and the transmitting and reflecting (T&R) coefficients of the STAR-RIS. To solve the nonconvex problem, an efficient algorithm is proposed that uses the alternating optimization framework to optimize the variables iteratively. Specifically, in each iteration, we derive the closed-form expression for the optimal power design. The successive convex approximation (SCA) method and semidefinite programming (SDP) are used to solve the passive beamforming optimization problem. Numerical results verify the convergence and effectiveness of the proposed algorithm, and further reveal in which scenarios STAR-RIS assisted FD communication outperforms half-duplex and conventional RIS schemes.

Performance Analysis and Codebook Design for mmWave Beamforming System with Beam Squint

Jan 18, 2021

Hongkang Yu, Pengxin Guan, Yiru Wang, Yuping Zhao

Abstract: Beamforming technology is widely used in millimeter wave systems to combat path losses, and beamformers are usually selected from a predefined codebook. Unfortunately, traditional codebook design neglects the beam squint effect, which causes severe performance degradation when the bandwidth is large. In this letter, we consider a wideband beamforming system that adopts a codebook of fixed size. First, based on rectangular beams with conventional beam coverage, we analyze how beam squint affects system performance and derive the expression for the average spectrum efficiency. Next, we formulate an optimization problem to design the optimal codebook. Simulation results demonstrate that the proposed codebook spreads beam coverage to cope with beam squint and significantly slows down the performance degradation.
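
The beam squint effect the abstract refers to can be seen in a few lines of code. The sketch below assumes a uniform linear array with half-wavelength spacing at the carrier frequency and frequency-flat phase shifters, a standard textbook model rather than the specific setup analyzed in the letter; the function name `normalized_gain` and its parameters are hypothetical.

```python
import cmath
import math


def normalized_gain(n_antennas, steer_deg, user_deg, freq_ratio):
    """Normalized array gain of a phase-shifter beam (illustrative model).

    Assumes a uniform linear array with half-wavelength spacing at the
    carrier frequency. freq_ratio is f / f_c, the subcarrier frequency
    relative to the carrier; steer_deg is the angle the phase shifters
    point to, user_deg is the actual direction of the user.
    """
    # Phase shifters are frequency-flat, but the array response scales
    # with frequency, so the two no longer cancel when freq_ratio != 1.
    psi = (freq_ratio * math.sin(math.radians(user_deg))
           - math.sin(math.radians(steer_deg)))
    total = sum(cmath.exp(1j * math.pi * n * psi) for n in range(n_antennas))
    return abs(total) / n_antennas
```

In this model a beam steered exactly at the user has full gain at the carrier frequency, loses gain at off-carrier frequencies when the steering angle is away from broadside, and is unaffected at broadside, which is why the loss grows with bandwidth and array size.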
