As it is cumbersome and expensive to acquire a huge amount of data for training neural dialog models, data augmentation is proposed to effectively utilize existing training samples. However, current data augmentation techniques on the dialog generation task mostly augment all cases in the training dataset without considering the intrinsic attributes of different cases. We argue that not all cases are beneficial for the augmentation task, and the cases suitable for augmentation should obey the following two attributes: (1) low-quality (the dialog model cannot generate a high-quality response for the case), (2) representative (the case should represent the property of the whole dataset). Herein, we explore this idea by proposing a Selective Data Augmentation framework (SDA) for the response generation task. SDA employs a dual adversarial network to select the lowest-quality and most representative data points for augmentation in one stage. Extensive experiments conducted on two publicly available datasets, i.e., DailyDialog and OpenSubtitles, show that our framework can improve the response generation performance with respect to various metrics.
In this paper, we investigate the uplink transmit power optimization problem in cell-free (CF) extremely large-scale multiple-input multiple-output (XL-MIMO) systems. Instead of applying the traditional methods, we propose two signal processing architectures: centralized training and centralized execution with fuzzy logic, and centralized training and decentralized execution with fuzzy logic. Both combine multi-agent reinforcement learning (MARL) with fuzzy logic to solve the power control design problem of maximizing the system spectral efficiency (SE). Furthermore, the uplink performance of the system adopting maximum ratio (MR) combining and local minimum mean-squared error (L-MMSE) combining is evaluated. Our results show that the proposed methods with fuzzy logic outperform the conventional MARL-based method and signal processing methods in terms of computational complexity. Also, the SE performance under MR combining is even better than that of the conventional MARL-based method.
In this paper, we investigate the uplink performance of cell-free (CF) extremely large-scale multiple-input multiple-output (XL-MIMO) systems, a promising technology for future wireless communications. More specifically, we consider the practical scenario with multiple base stations (BSs) and multiple user equipments (UEs). To this end, we derive exact achievable spectral efficiency (SE) expressions for any combining scheme. It is worth noting that we also derive closed-form SE expressions for the CF XL-MIMO with maximum ratio (MR) combining. Numerical results show that the SE performance of the CF XL-MIMO can be hugely improved compared with the small-cell XL-MIMO. It is interesting that a smaller antenna spacing leads to a higher correlation level among patch antennas. Finally, we prove that increasing the number of UE antennas may decrease the SE performance with MR combining.
Extremely large-scale multiple-input multiple-output (XL-MIMO) is a promising technology for the future sixth-generation (6G) networks to achieve higher performance. In practice, various linear precoding schemes, such as zero-forcing (ZF) and regularized zero-forcing (RZF) precoding, are capable of achieving both large spectral efficiency (SE) and low bit error rate (BER) in traditional massive MIMO (mMIMO) systems. However, these methods are not efficient in extremely large-scale regimes due to the inherent spatial non-stationarity and high computational complexity. To address this problem, we investigate a low-complexity precoding algorithm, e.g., randomized Kaczmarz (rKA), taking into account the spatial non-stationary properties in XL-MIMO systems. Furthermore, we propose a novel mode of randomization, i.e., sampling without replacement rKA (SwoR-rKA), which enjoys a faster convergence speed than the rKA algorithm. Besides, the closed-form expression of SE considering the interference between subarrays in downlink XL-MIMO systems is derived. Numerical results show that the complexity of both the rKA and SwoR-rKA algorithms is 51.3% lower than that of the traditional RZF algorithm at similar SE performance. More importantly, our algorithms can effectively reduce the BER when the transmitter has imperfect channel estimation.
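The difference between drawing rows with and without replacement in a Kaczmarz iteration can be illustrated with a minimal sketch. It is a generic, real-valued toy solver for a consistent system A x = b, not the precoder from the abstract above: the function name `kaczmarz`, the 3x3 test system, and the choice to reshuffle the rows once per sweep are all assumptions made for illustration, and the abstract does not specify the exact sampling rule used by SwoR-rKA.

```python
import random

def kaczmarz(A, b, sweeps=200, without_replacement=False, seed=0):
    """Approximately solve a consistent real system A x = b by row projections."""
    rng = random.Random(seed)
    n_rows, n_cols = len(A), len(A[0])
    row_norms = [sum(v * v for v in row) for row in A]  # squared row norms
    x = [0.0] * n_cols
    for _ in range(sweeps):
        if without_replacement:
            # Assumption: visit every row exactly once per sweep, in random order.
            order = rng.sample(range(n_rows), n_rows)
        else:
            # Classic randomized Kaczmarz: rows drawn with replacement,
            # with probability proportional to their squared norm.
            order = rng.choices(range(n_rows), weights=row_norms, k=n_rows)
        for i in order:
            # Project the current iterate onto the hyperplane a_i . x = b_i.
            residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
            step = residual / row_norms[i]
            x = [xj + step * a for a, xj in zip(A[i], x)]
    return x

# Hypothetical toy system with a known solution.
A = [[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
x_true = [1.0, -2.0, 3.0]
b = [sum(a * v for a, v in zip(row, x_true)) for row in A]

x_rka = kaczmarz(A, b)
x_swor = kaczmarz(A, b, without_replacement=True)
```

Each update touches a single row, so no matrix inversion is needed; this is the property that makes Kaczmarz-type methods attractive as a low-complexity alternative to computing an RZF-style inverse directly.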
We propose a new problem called audio-visual segmentation (AVS), in which the goal is to output a pixel-level map of the object(s) that produce sound at the time of the image frame. To facilitate this research, we construct the first audio-visual segmentation benchmark, i.e., AVSBench, providing pixel-wise annotations for sounding objects in audible videos. It contains three subsets: AVSBench-object (Single-source subset, Multi-sources subset) and AVSBench-semantic (Semantic-labels subset). Accordingly, three settings are studied: 1) semi-supervised audio-visual segmentation with a single sound source; 2) fully-supervised audio-visual segmentation with multiple sound sources, and 3) fully-supervised audio-visual semantic segmentation. The first two settings need to generate binary masks of sounding objects indicating pixels corresponding to the audio, while the third setting further requires generating semantic maps indicating the object category. To deal with these problems, we propose a new baseline method that uses a temporal pixel-wise audio-visual interaction module to inject audio semantics as guidance for the visual segmentation process. We also design a regularization loss to encourage audio-visual mapping during training. Quantitative and qualitative experiments on AVSBench compare our approach to several existing methods for related tasks, demonstrating that the proposed method is promising for building a bridge between the audio and pixel-wise visual semantics. Code is available at https://github.com/OpenNLPLab/AVSBench. Online benchmark is available at http://www.avlbench.opennlplab.cn.
In this paper, we investigate a cell-free massive multiple-input multiple-output system with both access points and user equipments equipped with multiple antennas over the Weichselberger Rayleigh fading channel. We study the uplink spectral efficiency (SE) for the fully centralized processing scheme and large-scale fading decoding (LSFD) scheme. To further improve the SE performance, we design the uplink precoding schemes based on the weighted sum SE maximization. Since the weighted sum SE maximization problem is not jointly convex over all optimization variables, two efficient uplink precoding schemes based on Iteratively Weighted sum-Minimum Mean Square Error (I-WMMSE) algorithms, which rely on the iterative minimization of weighted MSE, are proposed for the two processing schemes investigated. Furthermore, with maximum ratio combining applied in the LSFD scheme, we derive novel closed-form achievable SE expressions and optimal precoding schemes. Numerical results validate the proposed results and show that the I-WMMSE precoding schemes can achieve excellent sum SE performance with a large number of UE antennas.
Cell-free massive multiple-input multiple-output (CF mMIMO) provides good interference management by coordinating many more access points (APs) than user equipments (UEs). It becomes challenging to determine which APs should serve which UEs with which pilots when the number of UEs approximates the number of APs and far exceeds the number of pilots. Compared to the previous work, a better compromise between spectral efficiency (SE) and implementation simplicity is needed in such massive access scenarios. This paper proposes an interference-aware massive access (IAMA) scheme realizing joint AP-UE association and pilot assignment for CF mMIMO by exploiting the large-scale interference features. We propose an interference-aware reward as a novel performance metric and use it to develop two iterative algorithms to optimize the association and pilot assignment. The numerical results show a prominent advantage of our IAMA scheme over the benchmark schemes in terms of the user fairness and the average SE.
Cell-free massive MIMO (CF mMIMO) is a promising next-generation wireless architecture to realize federated learning (FL). However, sensitive information of user equipments (UEs) may be exposed to the involved access points or the central processing unit in practice. To guarantee data privacy, effective privacy-preserving mechanisms are defined in this paper. In particular, we demonstrate and characterize the possibility of exploiting the inherent quantization error, caused by low-resolution analog-to-digital converters (ADCs) and digital-to-analog converters (DACs), for privacy preservation in an FL CF mMIMO system. Furthermore, to reduce the required uplink training time in such a system, a stochastic non-convex design problem that jointly optimizes the transmit power and the data rate is formulated. To address the problem at hand, we propose a novel power control method by utilizing the successive convex approximation approach to obtain a suboptimal solution. Besides, an asynchronous protocol is established for mitigating the straggler effect to facilitate FL. Numerical results show that compared with the conventional full power transmission, adopting the proposed power control method can effectively reduce the uplink training time under various practical system settings. Also, our results unveil that our proposed asynchronous approach can reduce the waiting time at the central processing unit for receiving all user information, as there are no stragglers that require a long time to report their local updates.
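To make the notion of inherent quantization error concrete, the following minimal sketch passes random samples through a generic low-resolution uniform quantizer and measures the error it introduces. It is only an illustration under stated assumptions: the mid-rise quantizer, the function name `uniform_quantize`, the 3-bit resolution, and the input range [-1, 1] are all hypothetical choices, and the sketch does not reproduce the privacy analysis or the ADC/DAC model of the abstract above.

```python
import random

def uniform_quantize(x, bits, x_max=1.0):
    """Mid-rise uniform quantizer on [-x_max, x_max] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 * x_max / levels
    clipped = max(-x_max, min(x, x_max))
    # Index of the quantization cell, capped so x == x_max maps to the top cell.
    index = min(levels - 1, int((clipped + x_max) / step))
    # Reconstruct at the midpoint of the cell.
    return -x_max + (index + 0.5) * step

rng = random.Random(0)
samples = [rng.uniform(-1.0, 1.0) for _ in range(10000)]

bits = 3                      # assumed low resolution
step = 2.0 / 2 ** bits        # cell width for x_max = 1.0
errors = [uniform_quantize(s, bits) - s for s in samples]

max_abs_error = max(abs(e) for e in errors)       # bounded by step / 2
mse = sum(e * e for e in errors) / len(errors)    # close to step**2 / 12 for uniform inputs
```

Lowering `bits` widens each cell and therefore increases the perturbation added to every sample, which is the kind of built-in distortion the abstract proposes to exploit.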