Alert button
Picture for Wei Yu

Wei Yu

Alert button

Covariance-Based Activity Detection in Cooperative Multi-Cell Massive MIMO: Scaling Law and Efficient Algorithms

Nov 26, 2023
Ziyue Wang, Ya-Feng Liu, Zhaorui Wang, Wei Yu

This paper focuses on the covariance-based activity detection problem in a multi-cell massive multiple-input multiple-output (MIMO) system. In this system, active devices transmit their signature sequences to multiple base stations (BSs), and the BSs cooperatively detect the active devices based on the received signals. While the scaling law for the covariance-based activity detection in the single-cell scenario has been extensively analyzed in the literature, this paper aims to analyze the scaling law for the covariance-based activity detection in the multi-cell massive MIMO system. Specifically, this paper demonstrates a quadratic scaling law in the multi-cell system, under the assumption that the exponent in the classical path-loss model is greater than 2. This finding shows that, in the multi-cell MIMO system, the maximum number of active devices that can be detected correctly in each cell increases quadratically with the length of the signature sequence and decreases logarithmically with the number of cells (as the number of antennas tends to infinity). Moreover, in addition to analyzing the scaling law for the signature sequences randomly and uniformly distributed on a sphere, the paper also establishes the scaling law for signature sequences generated from a finite alphabet, which are easier to generate and store. Moreover, this paper proposes two efficient accelerated coordinate descent (CD) algorithms with a convergence guarantee for solving the device activity detection problem. The first algorithm reduces the complexity of CD by using an inexact coordinate update strategy. The second algorithm avoids unnecessary computations of CD by using an active set selection strategy. Simulation results show that the proposed algorithms exhibit excellent performance in terms of computational efficiency and detection error probability.

* 54 pages, 11 figures, submitted for possible publication 
Viaarxiv icon

Discerning and Enhancing the Weighted Sum-Rate Maximization Algorithms in Communications

Nov 08, 2023
Zepeng Zhang, Ziping Zhao, Kaiming Shen, Daniel P. Palomar, Wei Yu

Weighted sum-rate (WSR) maximization plays a critical role in communication system design. This paper examines three optimization methods for WSR maximization, which ensure convergence to stationary points: two block coordinate ascent (BCA) algorithms, namely, weighted sum-minimum mean-square error (WMMSE) and WSR maximization via fractional programming (WSR-FP), along with a minorization-maximization (MM) algorithm, WSR maximization via MM (WSR-MM). Our contributions are threefold. Firstly, we delineate the exact relationships among WMMSE, WSR-FP, and WSR-MM, which, despite their extensive use in the literature, lack a comprehensive comparative study. By probing the theoretical underpinnings linking the BCA and MM algorithmic frameworks, we reveal the direct correlations between the equivalent transformation techniques, essential to the development of WMMSE and WSR-FP, and the surrogate functions pivotal to WSR-MM. Secondly, we propose a novel algorithm, WSR-MM+, harnessing the flexibility of selecting surrogate functions in MM framework. By circumventing the repeated matrix inversions in the search for optimal Lagrange multipliers in existing algorithms, WSR-MM+ significantly reduces the computational load per iteration and accelerates convergence. Thirdly, we reconceptualize WSR-MM+ within the BCA framework, introducing a new equivalent transform, which gives rise to an enhanced version of WSR-FP, named as WSR-FP+. We further demonstrate that WSR-MM+ can be construed as the basic gradient projection method. This perspective yields a deeper understanding into its computational intricacies. Numerical simulations corroborate the connections between WMMSE, WSR-FP, and WSR-MM and confirm the efficacy of the proposed WSR-MM+ and WSR-FP+ algorithms.

Viaarxiv icon

Active Sensing for Localization with Reconfigurable Intelligent Surface

Oct 19, 2023
Zhongze Zhang, Tao Jiang, Wei Yu

This paper addresses an uplink localization problem in which the base station (BS) aims to locate a remote user with the aid of reconfigurable intelligent surface (RIS). This paper proposes a strategy in which the user transmits pilots over multiple time frames, and the BS adaptively adjusts the RIS reflection coefficients based on the observations already received so far in order to produce an accurate estimate of the user location at the end. This is a challenging active sensing problem for which finding an optimal solution involves a search through a complicated functional space whose dimension increases with the number of measurements. In this paper, we show that the long short-term memory (LSTM) network can be used to exploit the latent temporal correlation between measurements to automatically construct scalable information vectors (called hidden state) based on the measurements. Subsequently, the state vector can be mapped to the RIS configuration for the next time frame in a codebook-free fashion via a deep neural network (DNN). After all the measurements have been received, a final DNN can be used to map the LSTM cell state to the estimated user equipment (UE) position. Numerical result shows that the proposed active RIS design results in lower localization error as compared to existing active and nonactive methods. The proposed solution produces interpretable results and is generalizable to early stopping in the sequence of sensing stages.

* Accepted in IEEE International Conference on Communications (ICC) 2023 
Viaarxiv icon

Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

Sep 06, 2023
Qiong Wu, Wei Yu, Yiyi Zhou, Shubin Huang, Xiaoshuai Sun, Rongrong Ji

Figure 1 for Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
Figure 2 for Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
Figure 3 for Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
Figure 4 for Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models

With ever increasing parameters and computation, vision-language pre-trained (VLP) models exhibit prohibitive expenditure in downstream task adaption. Recent endeavors mainly focus on parameter efficient transfer learning (PETL) for VLP models by only updating a small number of parameters. However, excessive computational overhead still plagues the application of VLPs. In this paper, we aim at parameter and computation efficient transfer learning (PCETL) for VLP models. In particular, PCETL not only needs to limit the number of trainable parameters in VLP models, but also to reduce the computational redundancy during inference, thus enabling a more efficient transfer. To approach this target, we propose a novel dynamic architecture skipping (DAS) approach towards effective PCETL. Instead of directly optimizing the intrinsic architectures of VLP models, DAS first observes the significances of their modules to downstream tasks via a reinforcement learning (RL) based process, and then skips the redundant ones with lightweight networks, i.e., adapters, according to the obtained rewards. In this case, the VLP model can well maintain the scale of trainable parameters while speeding up its inference on downstream tasks. To validate DAS, we apply it to two representative VLP models, namely ViLT and METER, and conduct extensive experiments on a bunch of VL tasks. The experimental results not only show the great advantages of DAS in reducing computational complexity, e.g. -11.97% FLOPs of METER on VQA2.0, but also confirm its competitiveness against existing PETL methods in terms of parameter scale and performance. Our source code is given in our appendix.

Viaarxiv icon

Multi-Carrier Modulation: An Evolution from Time-Frequency Domain to Delay-Doppler Domain

Aug 03, 2023
Hai Lin, Jinhong Yuan, Wei Yu, Jingxian Wu, Lajos Hanzo

Figure 1 for Multi-Carrier Modulation: An Evolution from Time-Frequency Domain to Delay-Doppler Domain
Figure 2 for Multi-Carrier Modulation: An Evolution from Time-Frequency Domain to Delay-Doppler Domain
Figure 3 for Multi-Carrier Modulation: An Evolution from Time-Frequency Domain to Delay-Doppler Domain
Figure 4 for Multi-Carrier Modulation: An Evolution from Time-Frequency Domain to Delay-Doppler Domain

The recently proposed orthogonal delay-Doppler division multiplexing (ODDM) modulation, which is based on the new delay-Doppler (DD) domain orthogonal pulse (DDOP), is studied. A substantial benefit of the DDOP-based ODDM or general delay-Doppler domain multi-carrier (DDMC) modulation is that it achieves orthogonality with respect to the fine time and frequency resolutions of the DD domain. We first revisit the family of wireless channel models conceived for linear time-varying (LTV) channels, and then review the conventional multi-carrier (MC) modulation schemes and their design guidelines for both linear time-invariant (LTI) and LTV channels. Then we discuss the time-varying property of the LTV channels' DD domain impulse response and propose an impulse function based transmission strategy for equivalent sampled DD domain (ESDD) channels. Next, we take an in-depth look into the DDOP and the corresponding ODDM modulation to unveil its unique input-output relation for transmission over ESDD channels. Then, we point out that the conventional MC modulation design guidelines based on the Wely-Heisenberg (WH) frame theory can be relaxed without compromising its orthogonality or without violating the WH frame theory. More specifically, for a communication system having given bandwidth and duration, MC modulation signals can be designed based on a WH subset associated with sufficient (bi)orthogonality, which governs the (bi)orthogonality of the MC signal within the bandwidth and duration. This novel design guideline could potentially open up opportunities for developing future waveforms required by new applications such as communication systems associated with high delay and/or Doppler shifts, as well as integrated sensing and communications, etc.

* This paper has been submitted to the IEEE for possible publication. The supplementary material of this work will be posted at https://www.omu.ac.jp/eng/ees-sic/oddm/ 
Viaarxiv icon

Reinforcement Learning with Non-Cumulative Objective

Jul 11, 2023
Wei Cui, Wei Yu

Figure 1 for Reinforcement Learning with Non-Cumulative Objective
Figure 2 for Reinforcement Learning with Non-Cumulative Objective
Figure 3 for Reinforcement Learning with Non-Cumulative Objective
Figure 4 for Reinforcement Learning with Non-Cumulative Objective

In reinforcement learning, the objective is almost always defined as a \emph{cumulative} function over the rewards along the process. However, there are many optimal control and reinforcement learning problems in various application fields, especially in communications and networking, where the objectives are not naturally expressed as summations of the rewards. In this paper, we recognize the prevalence of non-cumulative objectives in various problems, and propose a modification to existing algorithms for optimizing such objectives. Specifically, we dive into the fundamental building block for many optimal control and reinforcement learning algorithms: the Bellman optimality equation. To optimize a non-cumulative objective, we replace the original summation operation in the Bellman update rule with a generalized operation corresponding to the objective. Furthermore, we provide sufficient conditions on the form of the generalized operation as well as assumptions on the Markov decision process under which the globally optimal convergence of the generalized Bellman updates can be guaranteed. We demonstrate the idea experimentally with the bottleneck objective, i.e., the objectives determined by the minimum reward along the process, on classical optimal control and reinforcement learning tasks, as well as on two network routing problems on maximizing the flow rates.

* 13 pages, 6 figures. To appear in IEEE Transactions on Machine Learning in Communications and Networking (TMLCN) 
Viaarxiv icon

On the Choice of Perception Loss Function for Learned Video Compression

May 30, 2023
Sadaf Salehkalaibar, Buu Phan, Jun Chen, Wei Yu, Ashish Khisti

Figure 1 for On the Choice of Perception Loss Function for Learned Video Compression
Figure 2 for On the Choice of Perception Loss Function for Learned Video Compression
Figure 3 for On the Choice of Perception Loss Function for Learned Video Compression
Figure 4 for On the Choice of Perception Loss Function for Learned Video Compression

We study causal, low-latency, sequential video compression when the output is subjected to both a mean squared-error (MSE) distortion loss as well as a perception loss to target realism. Motivated by prior approaches, we consider two different perception loss functions (PLFs). The first, PLF-JD, considers the joint distribution (JD) of all the video frames up to the current one, while the second metric, PLF-FMD, considers the framewise marginal distributions (FMD) between the source and reconstruction. Using information theoretic analysis and deep-learning based experiments, we demonstrate that the choice of PLF can have a significant effect on the reconstruction, especially at low-bit rates. In particular, while the reconstruction based on PLF-JD can better preserve the temporal correlation across frames, it also imposes a significant penalty in distortion compared to PLF-FMD and further makes it more difficult to recover from errors made in the earlier output frames. Although the choice of PLF decisively affects reconstruction quality, we also demonstrate that it may not be essential to commit to a particular PLF during encoding and the choice of PLF can be delegated to the decoder. In particular, encoded representations generated by training a system to minimize the MSE (without requiring either PLF) can be {\em near universal} and can generate close to optimal reconstructions for either choice of PLF at the decoder. We validate our results using (one-shot) information-theoretic analysis, detailed study of the rate-distortion-perception tradeoff of the Gauss-Markov source model as well as deep-learning based experiments on moving MNIST and KTH datasets.

Viaarxiv icon

Active Sensing for Two-Sided Beam Alignment and Reflection Design Using Ping-Pong Pilots

May 11, 2023
Tao Jiang, Foad Sohrabi, Wei Yu

Figure 1 for Active Sensing for Two-Sided Beam Alignment and Reflection Design Using Ping-Pong Pilots
Figure 2 for Active Sensing for Two-Sided Beam Alignment and Reflection Design Using Ping-Pong Pilots
Figure 3 for Active Sensing for Two-Sided Beam Alignment and Reflection Design Using Ping-Pong Pilots
Figure 4 for Active Sensing for Two-Sided Beam Alignment and Reflection Design Using Ping-Pong Pilots

Beam alignment is an important task for millimeter-wave (mmWave) communication, because constructing aligned narrow beams both at the transmitter (Tx) and the receiver (Rx) is crucial in terms of compensating the significant path loss in very high-frequency bands. However, beam alignment is also a highly nontrivial task because large antenna arrays typically have a limited number of radio-frequency chains, allowing only low-dimensional measurements of the high-dimensional channel. This paper considers a two-sided beam alignment problem based on an alternating ping-pong pilot scheme between Tx and Rx over multiple rounds without explicit feedback. We propose a deep active sensing framework in which two long short-term memory (LSTM) based neural networks are employed to learn the adaptive sensing strategies (i.e., measurement vectors) and to produce the final aligned beamformers at both sides. In the proposed ping-pong protocol, the Tx and the Rx alternately send pilots so that both sides can leverage local observations to sequentially design their respective sensing and data transmission beamformers. The proposed strategy can be extended to scenarios with a reconfigurable intelligent surface (RIS) for designing, in addition, the reflection coefficients at the RIS for both sensing and communications. Numerical experiments demonstrate significant and interpretable performance improvement. The proposed strategy works well even for the challenging multipath channel environments.

* This paper is accepted in IEEE Journal on Selected Areas in Information Theory 
Viaarxiv icon

Uncertainty Injection: A Deep Learning Method for Robust Optimization

Feb 27, 2023
Wei Cui, Wei Yu

Figure 1 for Uncertainty Injection: A Deep Learning Method for Robust Optimization
Figure 2 for Uncertainty Injection: A Deep Learning Method for Robust Optimization
Figure 3 for Uncertainty Injection: A Deep Learning Method for Robust Optimization
Figure 4 for Uncertainty Injection: A Deep Learning Method for Robust Optimization

This paper proposes a paradigm of uncertainty injection for training deep learning model to solve robust optimization problems. The majority of existing studies on deep learning focus on the model learning capability, while assuming the quality and accuracy of the inputs data can be guaranteed. However, in realistic applications of deep learning for solving optimization problems, the accuracy of inputs, which are the problem parameters in this case, plays a large role. This is because, in many situations, it is often costly or sometime impossible to obtain the problem parameters accurately, and correspondingly, it is highly desirable to develop learning algorithms that can account for the uncertainties in the input and produce solutions that are robust against these uncertainties. This paper presents a novel uncertainty injection scheme for training machine learning models that are capable of implicitly accounting for the uncertainties and producing statistically robust solutions. We further identify the wireless communications as an application field where uncertainties are prevalent in problem parameters such as the channel coefficients. We show the effectiveness of the proposed training scheme in two applications: the robust power loading for multiuser multiple-input-multiple-output (MIMO) downlink transmissions; and the robust power control for device-to-device (D2D) networks.

* 13 pages, 7 figures. To appear in IEEE Transactions on Wireless Communications 
Viaarxiv icon

Learning cross space mapping via DNN using large scale click-through logs

Feb 26, 2023
Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

Figure 1 for Learning cross space mapping via DNN using large scale click-through logs
Figure 2 for Learning cross space mapping via DNN using large scale click-through logs
Figure 3 for Learning cross space mapping via DNN using large scale click-through logs
Figure 4 for Learning cross space mapping via DNN using large scale click-through logs

The gap between low-level visual signals and high-level semantics has been progressively bridged by continuous development of deep neural network (DNN). With recent progress of DNN, almost all image classification tasks have achieved new records of accuracy. To extend the ability of DNN to image retrieval tasks, we proposed a unified DNN model for image-query similarity calculation by simultaneously modeling image and query in one network. The unified DNN is named the cross space mapping (CSM) model, which contains two parts, a convolutional part and a query-embedding part. The image and query are mapped to a common vector space via these two parts respectively, and image-query similarity is naturally defined as an inner product of their mappings in the space. To ensure good generalization ability of the DNN, we learn weights of the DNN from a large number of click-through logs which consists of 23 million clicked image-query pairs between 1 million images and 11.7 million queries. Both the qualitative results and quantitative results on an image retrieval evaluation task with 1000 queries demonstrate the superiority of the proposed method.

* IEEE TRANSACTIONS ON MULTIMEDIA, VOL.17, NO.11, pp.2000-2007, NOVEMBER 2015  
* Accepted by IEEE Transactions on Multimedia 2015 
Viaarxiv icon