Owing to its ability to overcome the double-fading effect that limits the passive intelligent reflecting surface (IRS), the active IRS is emerging as a promising technique for future 6G wireless networks. To fully exploit the amplifying gain of the active IRS, two high-rate methods, maximum ratio reflecting (MRR) and selective ratio reflecting (SRR), are presented, motivated by maximum ratio combining and selective ratio combining, respectively; both admit closed-form solutions. To further improve the rate, a maximum reflected-signal-to-noise-ratio (Max-RSNR) method is proposed, which alternately iterates between adjusting the norm of the beamforming vector and its normalized direction. This yields a substantial rate enhancement over the existing equal-gain reflecting (EGR) scheme. Simulation results show that the three proposed methods achieve significantly higher rates than EGR. In decreasing order of rate performance, they rank as follows: Max-RSNR, MRR, SRR, and EGR.
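The analogy between MRR and maximum ratio combining can be illustrated with a minimal sketch, under simplified assumptions that are not the paper's exact system model: a single-antenna cascaded channel, a unit-norm constraint on the reflecting vector, and no amplification noise. The channel vectors `h`, `g` and element count `N` below are illustrative placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 16                      # number of IRS elements (illustrative)
# Rayleigh-fading transmitter-to-IRS and IRS-to-receiver channels
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
g = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

cascaded = h * g            # per-element cascaded channel

# MRC-style reflecting: weight each element in proportion to its cascaded
# channel gain and cancel its phase, normalized to unit total power.
theta_mrr = np.conj(cascaded) / np.linalg.norm(cascaded)

# Equal-gain reflecting (EGR): cancel the phase only, equal amplitudes.
theta_egr = np.exp(-1j * np.angle(cascaded)) / np.sqrt(N)

snr_mrr = np.abs(cascaded @ theta_mrr) ** 2   # equals ||cascaded||^2
snr_egr = np.abs(cascaded @ theta_egr) ** 2   # (sum |c_i|)^2 / N
assert snr_mrr >= snr_egr   # Cauchy-Schwarz: MRC-style weighting never loses
```

As in receive combining, the equal-gain scheme matches the MRC-style scheme only when all cascaded element gains have equal magnitude, which explains the rate gap observed over EGR.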
Oriented object detection arises in many applications, from aerial images to autonomous driving, yet many existing detection benchmarks are annotated only with horizontal bounding boxes, which are also less costly than fine-grained rotated boxes, leading to a gap between the readily available training corpus and the rising demand for oriented object detection. This paper proposes a simple yet effective oriented object detection approach called H2RBox that uses only horizontal box annotations for weakly-supervised training, which closes the above gap and shows competitive performance even against detectors trained with rotated boxes. The core of our method is a combination of weakly- and self-supervised learning, which predicts the angle of an object by enforcing consistency between two different views. To the best of our knowledge, H2RBox is the first horizontal box annotation-based oriented object detector. Compared to an alternative, i.e., horizontal box-supervised instance segmentation with our post-hoc adaptation to oriented object detection, our approach is not susceptible to the quality of the predicted masks and performs more robustly in complex scenes containing large numbers of dense objects and outliers. Experimental results show that H2RBox has significant performance and speed advantages over horizontal box-supervised instance segmentation methods, as well as lower memory requirements. Compared with rotated box-supervised oriented object detectors, our method shows very close performance and speed, and even surpasses them in some cases. The source code is available at https://github.com/yangxue0827/h2rbox-mmrotate.
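The self-supervised consistency idea can be sketched as follows. This is a hypothetical toy loss, not the exact H2RBox objective: if the second view is the first view rotated by a known angle `delta`, the two predicted box angles should differ by `delta` modulo pi (a rotated box is pi-periodic in its angle).

```python
import numpy as np

def angle_consistency_loss(theta_orig, theta_rot, delta):
    """Toy self-supervised consistency loss between two views.

    theta_orig: angle predicted on the original image
    theta_rot:  angle predicted on the image rotated by `delta`
    The residual is wrapped into [-pi/2, pi/2) to respect the
    pi-periodicity of rotated-box angles.
    """
    diff = (theta_rot - theta_orig - delta + np.pi / 2) % np.pi - np.pi / 2
    return float(np.mean(np.square(diff)))

# Perfectly consistent predictions incur zero loss...
assert angle_consistency_loss(0.3, 0.8, 0.5) < 1e-12
# ...while predictions that ignore the rotation are penalized.
assert angle_consistency_loss(0.3, 0.3, 0.5) > 0.1
```

Minimizing such a residual over many random rotations is what lets the angle be learned without any rotated-box labels.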
Night imaging with modern smartphone cameras is troublesome due to the low photon count and unavoidable noise in the imaging system. Directly adjusting the exposure time and ISO rating cannot produce sharp, noise-free images simultaneously in low-light conditions. Though many methods have been proposed to enhance noisy or blurry night images, their performance on real-world night photos remains unsatisfactory for two main reasons: 1) the limited information in a single image and 2) the domain gap between synthetic training images and real-world photos (e.g., differences in blur area and resolution). To exploit the information in successive long- and short-exposure images, we propose a learning-based pipeline to fuse them. A D2HNet framework is developed to recover a high-quality image by deblurring and enhancing a long-exposure image under the guidance of a short-exposure image. To shrink the domain gap, we leverage a two-phase DeblurNet-EnhanceNet architecture, which performs accurate blur removal at a fixed low resolution so that it can handle large ranges of blur in inputs of different resolutions. In addition, we synthesize a D2-Dataset from HD videos and experiment on it. The results on the validation set and on real photos demonstrate that our method achieves better visual quality and state-of-the-art quantitative scores. The D2HNet codes and D2-Dataset can be found at https://github.com/zhaoyuzhi/D2HNet.
The rapid development of deep learning has brought great progress to segmentation, one of the fundamental tasks of computer vision. However, current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious to obtain. To alleviate this burden, the past years have witnessed increasing attention to building label-efficient, deep-learning-based segmentation algorithms. This paper offers a comprehensive review of label-efficient segmentation methods. To this end, we first develop a taxonomy that organizes these methods according to the supervision provided by different types of weak labels (including no supervision, coarse supervision, incomplete supervision, and noisy supervision), supplemented by the types of segmentation problems (including semantic segmentation, instance segmentation, and panoptic segmentation). Next, we summarize the existing label-efficient segmentation methods from a unified perspective centered on an important question: how to bridge the gap between weak supervision and dense prediction. Current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraints, cross-view consistency, and cross-image relations. Finally, we share our opinions on future research directions for label-efficient deep segmentation.
For a passive direction of arrival (DOA) measurement system using massive multiple-input multiple-output (MIMO), the complexity of the covariance-matrix-decomposition-based DOA measurement method is extremely high. To significantly reduce the computational complexity, two strategies are proposed. First, a rapid power-iterative estimation of signal parameters via rotational invariance techniques (RPI-ESPRIT) method is proposed, which not only reduces the complexity but also achieves good direction measurement results. However, its overall complexity is still high. To further reduce the complexity, a rapid power-iterative root multiple signal classification (RPI-Root-MUSIC) method is proposed. Simulation results show that the two proposed methods outperform the classical DOA estimation methods in terms of computational complexity. In particular, the lowest complexity achieved by the RPI-Root-MUSIC method is about two orders of magnitude lower than that of Root-MUSIC in terms of FLOPs. In addition, it is verified that the initial vector and the relative error have a substantial effect on the computational complexity.
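The complexity saving rests on replacing the full eigendecomposition of the sample covariance matrix with power iteration, which extracts the dominant eigenvector at O(M^2) per matrix-vector product instead of O(M^3) overall. A minimal sketch follows; the matrix sizes, tolerance, and stopping rule are illustrative and not taken from the paper.

```python
import numpy as np

def power_iteration(R, num_iter=200, tol=1e-12, x0=None):
    """Dominant eigenvector of a Hermitian PSD covariance matrix R.

    Each iteration costs one matrix-vector product, O(M^2), versus the
    O(M^3) of a full eigendecomposition; this is where the complexity
    savings for massive MIMO arrays come from. The initial vector x0 and
    the tolerance both affect how many iterations are needed.
    """
    M = R.shape[0]
    x = np.ones(M, dtype=complex) if x0 is None else x0.astype(complex)
    x /= np.linalg.norm(x)
    for _ in range(num_iter):
        y = R @ x
        y /= np.linalg.norm(y)
        if np.linalg.norm(y - x) < tol:
            x = y
            break
        x = y
    return x

# Toy check against a full eigendecomposition, using a covariance matrix
# with one clearly dominant eigen-direction u.
rng = np.random.default_rng(1)
M = 8
u = rng.standard_normal(M) + 1j * rng.standard_normal(M)
u /= np.linalg.norm(u)
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = 50 * np.outer(u, u.conj()) + (A @ A.conj().T) / M

v = power_iteration(R)
w, V = np.linalg.eigh(R)
v_ref = V[:, -1]                 # eigenvector of the largest eigenvalue
# Eigenvectors agree up to a global phase.
assert abs(abs(v.conj() @ v_ref) - 1.0) < 1e-6
```

In the full methods, the eigenvector obtained this way feeds the signal-subspace step of ESPRIT or Root-MUSIC in place of the decomposition of the whole covariance matrix.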
The appearances of children are inherited from their parents, which makes it feasible to predict them. Predicting realistic children's faces may help address many social problems, such as age-invariant face recognition, kinship verification, and missing child identification. It can be regarded as an image-to-image translation task. Existing approaches usually assume that domain information in image-to-image translation can be interpreted as "style", i.e., the separation of image content and style. However, such a separation is improper for child face prediction, because the facial contours of children and parents are not the same. To address this issue, we propose a new disentangled learning strategy for children's face prediction. We assume that children's faces are determined by genetic factors (compact family features, e.g., face contour), external factors (facial attributes irrelevant to prediction, such as moustaches and glasses), and variety factors (individual properties of each child). On this basis, we formulate prediction as a mapping from parents' genetic factors to children's genetic factors, disentangling them from the external and variety factors. To obtain accurate genetic factors and perform the mapping, we propose a ChildPredictor framework. It maps human faces to genetic factors with encoders and back with generators, and then learns the relationship between the genetic factors of parents and children through a mapping function. To ensure that the generated faces are realistic, we collect a large Family Face Database to train ChildPredictor and evaluate it on the FF-Database validation set. Experimental results demonstrate that ChildPredictor is superior to other well-known image-to-image translation methods in predicting realistic and diverse child faces. Implementation codes can be found at https://github.com/zhaoyuzhi/ChildPredictor.
In this paper, an intelligent reflecting surface (IRS)-aided two-way decode-and-forward (DF) relay wireless network is considered, where two users exchange information via the IRS and the DF relay. To enhance the sum-rate performance, three power allocation (PA) strategies are proposed. First, a method of maximizing the sum rate (Max-SR) is proposed to jointly optimize the PA factors of user U1, user U2, and the relay station (RS). To further improve the sum-rate performance, two high-performance schemes, namely maximizing the minimum sum rate (Max-Min-SR) and maximizing the sum rate with a rate constraint (Max-SR-RC), are presented. Simulation results show that the three proposed methods outperform the equal power allocation (EPA) method in terms of sum-rate performance. In particular, the highest performance gain achieved by the Max-SR-RC method is up to 45.2% over EPA. Furthermore, it is verified that the total power and the random shadowing variable $X_\sigma$ have a substantial impact on the sum-rate performance.
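The gain of optimized PA over EPA can be illustrated with a heavily simplified toy model, not the paper's actual objective: each user's rate in DF two-way relaying is limited by its weaker hop, the link gains `g1r`, `g2r`, `gr1`, `gr2` are arbitrary hypothetical values, and a crude grid search over the PA simplex stands in for the Max-SR optimization.

```python
import numpy as np

P = 10.0                                  # total transmit power (illustrative)
g1r, g2r, gr1, gr2 = 2.0, 1.0, 1.5, 0.8   # hypothetical link SNR gains

def sum_rate(b1, b2, br):
    """Toy DF two-way relay sum rate: each direction is capped by the
    weaker of its two hops (user-to-relay vs relay-to-other-user)."""
    r1 = min(np.log2(1 + b1 * P * g1r), np.log2(1 + br * P * gr2))  # U1->RS->U2
    r2 = min(np.log2(1 + b2 * P * g2r), np.log2(1 + br * P * gr1))  # U2->RS->U1
    return r1 + r2

# Grid search over the PA simplex b1 + b2 + br = 1, a crude stand-in
# for the Max-SR optimization.
best = max(((b1, b2, 1.0 - b1 - b2)
            for b1 in np.linspace(0.01, 0.98, 98)
            for b2 in np.linspace(0.01, 0.98 - b1, 50)),
           key=lambda b: sum_rate(*b))
epa = sum_rate(1 / 3, 1 / 3, 1 / 3)       # equal power allocation baseline
assert sum_rate(*best) >= epa             # optimized PA never falls below EPA
```

With asymmetric link gains, the optimum shifts power toward the bottleneck hops, which is the intuition behind the reported gains over EPA.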
Partially-supervised instance segmentation is a task that requires segmenting objects from novel, unseen categories by learning on limited seen categories with annotated masks, thus eliminating the heavy annotation burden. The key to addressing this task is to build an effective class-agnostic mask segmentation model. Unlike previous methods that learn such models only on seen categories, in this paper we propose a new method, named ContrastMask, which learns a mask segmentation model on both seen and unseen categories under a unified pixel-level contrastive learning framework. In this framework, annotated masks of seen categories and pseudo masks of unseen categories serve as a prior for contrastive learning, where features from the mask regions (foreground) are pulled together and contrasted against those from the background, and vice versa. Through this framework, feature discrimination between foreground and background is largely improved, facilitating the learning of the class-agnostic mask segmentation model. Exhaustive experiments on the COCO dataset demonstrate the superiority of our method, which outperforms previous state-of-the-art methods.
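The pull-together/push-apart mechanism can be sketched with a minimal InfoNCE-style pixel contrast, which is not the exact ContrastMask loss: foreground pixel embeddings are attracted to a foreground query (here simply their mean) and contrasted against background pixels. All shapes and the temperature are illustrative.

```python
import numpy as np

def pixel_contrastive_loss(features, mask, tau=0.1):
    """Minimal InfoNCE-style pixel contrast (illustrative sketch).

    features: (N, D) L2-normalized per-pixel embeddings
    mask:     (N,) binary; 1 = foreground (mask region), 0 = background
    Foreground pixels serve as positives against a mean-foreground query;
    background pixels serve as negatives.
    """
    fg = features[mask == 1]
    bg = features[mask == 0]
    query = fg.mean(axis=0)
    query /= np.linalg.norm(query)
    pos = np.exp(fg @ query / tau)          # similarity to positives
    neg = np.exp(bg @ query / tau).sum()    # summed similarity to negatives
    return float(-np.log(pos / (pos + neg)).mean())

# Toy check: well-separated fg/bg embeddings give a much smaller loss
# than undiscriminative random embeddings.
rng = np.random.default_rng(0)
D = 16
feats = np.concatenate([np.eye(D)[0] + 0.05 * rng.standard_normal((32, D)),
                        np.eye(D)[1] + 0.05 * rng.standard_normal((32, D))])
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
mask = np.concatenate([np.ones(32), np.zeros(32)])
loss_sep = pixel_contrastive_loss(feats, mask)
loss_rand = pixel_contrastive_loss(rng.standard_normal((64, D)) / np.sqrt(D),
                                   mask)
assert loss_sep < loss_rand
```

Driving this loss down is what improves foreground/background feature discrimination, regardless of object category.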
In this paper, we investigate the problem of pilot optimization and channel estimation for a two-way relaying network (TWRN) aided by an intelligent reflecting surface (IRS) with finite discrete phase shifters. In a TWRN, there exists a challenging problem: the two cascaded channels, source-to-IRS-to-relay and destination-to-IRS-to-relay, interfere with each other. By designing the initial phase shifts of the IRS and the pilot pattern, the two cascaded channels are separated using simple arithmetic operations such as addition and subtraction. Then, the least-squares (LS) estimator is adopted to estimate the two cascaded channels and the two direct channels from the source to the relay and from the destination to the relay. The corresponding mean square errors (MSEs) of the channel estimators are derived. By minimizing the MSE, the optimal phase shift matrix of the IRS is derived, and two special matrices, the Hadamard matrix and the discrete Fourier transform (DFT) matrix, are shown to be optimal training matrices for the IRS. Furthermore, an IRS with finite discrete phase shifters is taken into account. Using theoretical derivation and numerical simulations, we find that 3-4 bit phase shifters are sufficient for the IRS to achieve a negligible MSE performance loss. More importantly, the Hadamard matrix requires only one-bit phase shifters to achieve the optimal MSE performance, while the DFT matrix requires at least three or four bits to achieve the same performance. Thus, the Hadamard matrix is a perfect choice for channel estimation with an IRS using low-resolution phase shifters.
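Why an orthogonal training matrix minimizes the LS MSE can be seen in a minimal sketch, under a simplified model y = Vh + n that abstracts away the TWRN structure: the MSE of the LS estimate is sigma^2 tr((V^H V)^{-1}), which is minimized when the columns of V are orthogonal. The element count and noise level below are illustrative.

```python
import numpy as np

def sylvester_hadamard(n):
    """n x n Hadamard matrix (n a power of two) via the Sylvester
    construction. Entries are +/-1, so realizing it as IRS training
    only needs 1-bit phase shifters (phases 0 or pi)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.kron(H, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return H

rng = np.random.default_rng(0)
N = 8                                     # IRS elements (illustrative)
h = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
sigma = 0.1                               # noise standard deviation

def ls_mse(V, trials=2000):
    """Empirical MSE of the LS estimate h_hat = V^{-1} y for y = V h + n."""
    Vinv = np.linalg.inv(V)
    err = 0.0
    for _ in range(trials):
        n = sigma * (rng.standard_normal(N)
                     + 1j * rng.standard_normal(N)) / np.sqrt(2)
        err += np.linalg.norm(Vinv @ (V @ h + n) - h) ** 2
    return err / trials

V_had = sylvester_hadamard(N)                        # orthogonal, 1-bit phases
V_rand = np.exp(1j * 2 * np.pi * rng.random((N, N))) # random unit-modulus pilots

mse_had = ls_mse(V_had)
mse_rand = ls_mse(V_rand)
assert mse_had < mse_rand    # orthogonal training attains the lower LS MSE
```

The DFT matrix is equally orthogonal and attains the same MSE, but its entries require finer phase quantization than the Hadamard matrix's two phases, which is the practical advantage highlighted above.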