Collaborative filtering (CF) is an important research direction in recommender systems that aims to make recommendations given information on user-item interactions. Graph CF has attracted increasing attention in recent years due to its effectiveness in leveraging high-order information in the user-item bipartite graph for better recommendations. Specifically, recent studies attribute the success of graph neural networks (GNN) for CF to their low-pass filtering effects. However, current research lacks a study of how different signal components contribute to recommendations and how to design strategies that exploit them properly. To this end, from the view of spectral transformation, we analyze the important factors that a graph filter should consider to achieve better performance. Based on these findings, we design JGCF, an efficient and effective method for CF based on Jacobi polynomial bases and frequency decomposition strategies. Extensive experiments on four widely used public datasets show the effectiveness and efficiency of the proposed method, which brings up to a 27.06% performance gain on Alibaba-iFashion. Besides, the experimental results also show that JGCF is better at handling sparse datasets, indicating its potential for making recommendations to cold-start users.
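As a rough illustration of how a polynomial graph filter of the kind JGCF builds on works, the sketch below evaluates Jacobi polynomial bases P_k^{(a,b)} of a normalized adjacency matrix applied to a graph signal, via the standard three-term recurrence. The parameter defaults and the tiny one-node test graph are illustrative assumptions, not the configuration used by JGCF.

```python
import numpy as np

def jacobi_filter_bases(A, x, K, a=1.0, b=1.0):
    """Return [P_0(A)x, ..., P_K(A)x] via the Jacobi three-term recurrence.

    A: normalized adjacency with spectrum in [-1, 1]; x: graph signal.
    a, b: Jacobi parameters (illustrative defaults, not JGCF's tuning).
    """
    bases = [x.copy()]  # P_0 = 1, so P_0(A)x = x
    if K >= 1:
        # P_1^{(a,b)}(t) = (a + 1) + (a + b + 2)(t - 1)/2, applied to A
        bases.append((a + 1) * x + (a + b + 2) / 2 * (A @ x - x))
    for n in range(2, K + 1):
        c0 = 2 * n * (n + a + b) * (2 * n + a + b - 2)
        c1 = (2 * n + a + b - 1) * (2 * n + a + b) * (2 * n + a + b - 2)
        c2 = (2 * n + a + b - 1) * (a * a - b * b)
        c3 = 2 * (n + a - 1) * (n + b - 1) * (2 * n + a + b)
        bases.append((c1 * (A @ bases[-1]) + c2 * bases[-1]
                      - c3 * bases[-2]) / c0)
    return bases

# Sanity check: with a = b = 0, Jacobi reduces to Legendre, P_2(t) = (3t^2 - 1)/2.
A = np.array([[0.5]])            # 1x1 "graph" acting as the scalar 0.5
P = jacobi_filter_bases(A, np.array([1.0]), K=2, a=0.0, b=0.0)
```

A graph filter is then a weighted sum of these bases, with the weights chosen to emphasize the desired (e.g. low-frequency) spectral components.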
Recommender systems have profoundly changed people's daily life and work, bringing substantial business value. In the recommendation domain, simulation-based and real-data-based studies are two typical research paradigms, each with its own advantages. Previously, real-data-based studies have occupied a more prominent position, since accurately simulating user preferences is quite difficult. Recently, large language models (LLM) have shown great potential to achieve human-like intelligence, which provides new opportunities to overcome the shortcomings of simulation-based studies and thus highlight their advantages, such as broader application scenarios and cheaper data acquisition. To shed light on this direction, in this paper, we introduce an LLM-based recommender simulator called RecAgent. Our simulator is composed of two modules: (1) the user module and (2) the recommender module. The user module can browse the recommendation website, communicate with other users, and broadcast messages on social media. The recommender module is designed to provide search or recommendation lists to the users, and one can plug in different models to implement the recommender. All users take actions based on LLMs and can evolve freely, as in the real world. We present several case studies to demonstrate that the users in our simulator indeed behave in a reasonable manner as expected. Our project has been released at https://github.com/RUC-GSAI/YuLan-Rec.
In this paper, we propose a novel integrated sensing and communication (ISAC) complex convolutional neural network (CNN) CSI enhancer for 6G networks, which exploits the correlation between sensing parameters, such as angle-of-arrival (AoA) and range, and the channel state information (CSI) to significantly improve CSI estimation accuracy and further enhance sensing accuracy. The ISAC complex CNN CSI enhancer uses complex-valued computation layers to form the CNN, which better preserves the phase information of CSI. Furthermore, we incorporate ISAC transform modules into the CNN enhancer to transform the CSI into the sparse angle-delay domain, where it can be treated as images with prominent peaks that are well suited to CNN processing. Then, we further propose a novel biased FFT-based sensing scheme, where we actively add known phase bias terms to the original CSI to generate multiple estimation results using a simple FFT-based sensing method, and finally average all the debiased sensing results to obtain more accurate range estimates. Extensive simulation results show that the ISAC complex CNN CSI enhancer converges within 30 training epochs. Its CSI estimation normalized mean square error (NMSE) is about 17 dB lower than that of the MMSE method, and the bit error rate (BER) of demodulation using the enhanced CSI approaches that of perfect CSI. Finally, the range estimation MSE of the proposed biased FFT-based sensing method approaches that of the subspace-based method with much lower complexity.
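The biased-FFT idea can be sketched in a toy one-dimensional frequency estimation problem: a known phase bias shifts the true frequency relative to the FFT grid, and averaging the debiased peak estimates reduces the grid quantization error. The signal length, true frequency, and bias values below are illustrative assumptions, not the paper's ISAC setup.

```python
import numpy as np

def fft_freq_estimate(x, bias=0.0):
    # Apply a known phase bias e^{j 2*pi*bias*n}, pick the FFT peak, then debias.
    n = np.arange(len(x))
    X = np.fft.fft(x * np.exp(2j * np.pi * bias * n))
    peak = np.argmax(np.abs(X))
    return peak / len(x) - bias

N, f_true = 64, 0.1234                       # toy values
x = np.exp(2j * np.pi * f_true * np.arange(N))

single = fft_freq_estimate(x)                # plain FFT estimate (grid-quantized)
biases = [k / (8 * N) for k in range(8)]     # sub-bin phase biases
avg = np.mean([fft_freq_estimate(x, b) for b in biases])
```

Each biased run quantizes to a different grid point, so the individual quantization errors partially cancel in the average.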
Embedding models have shown great power in the knowledge graph completion (KGC) task. By learning structural constraints for each training triple, these methods implicitly memorize intrinsic relation rules to infer missing links. However, this paper points out that multi-hop relation rules are hard to memorize reliably due to the inherent deficiencies of such an implicit memorization strategy, making embedding models underperform in predicting links between distant entity pairs. To alleviate this problem, we present the Vertical Learning Paradigm (VLP), which extends embedding models by allowing them to explicitly copy target information from related factual triples for more accurate prediction. Rather than relying solely on implicit memory, VLP directly provides additional cues to improve the generalization ability of embedding models, in particular making distant link prediction significantly easier. Moreover, we also propose a novel relative-distance-based negative sampling technique (ReD) for more effective optimization. Experiments demonstrate the validity and generality of our proposals on two standard benchmarks. Our code is available at https://github.com/rui9812/VLP.
Data sparsity is an important issue for click-through rate (CTR) prediction, particularly when user-item interactions are too sparse to learn a reliable model. Recently, many works on cross-domain CTR (CDCTR) prediction have been developed in an effort to leverage meaningful data from a related domain. However, most existing CDCTR works have an impractical limitation that requires homogeneous inputs (\textit{i.e.,} shared feature fields) across domains, and CDCTR with heterogeneous inputs (\textit{i.e.,} varying feature fields) across domains has not been widely explored but is an urgent and important research problem. In this work, we propose a cross-domain augmentation network (CDAnet) that is able to perform knowledge transfer between two domains with \textit{heterogeneous inputs}. Specifically, CDAnet contains a designed translation network and an augmentation network, which are trained sequentially. The translation network computes features from the two domains with heterogeneous inputs separately through two independent branches, and then learns meaningful cross-domain knowledge using a designed cross-supervised feature translator. Later, the augmentation network encodes the learned cross-domain knowledge via feature translation performed in the latent space and fine-tunes the model for final CTR prediction. Through extensive experiments on two public benchmarks and one industrial production dataset, we show that CDAnet can learn meaningful translated features and largely improve the performance of CTR prediction. CDAnet has been evaluated in an online A/B test in image2product retrieval at the Taobao app for over 20 days, bringing an absolute \textbf{0.11 point} CTR improvement and a relative \textbf{1.26\%} GMV increase.
While progress in 2D generative models of human appearance has been rapid, many applications require 3D avatars that can be animated and rendered. Unfortunately, most existing methods for learning generative models of 3D humans with diverse shape and appearance require 3D training data, which is limited and expensive to acquire. The key to progress is hence to learn generative models of 3D avatars from abundant unstructured 2D image collections. However, learning realistic and complete 3D appearance and geometry in this under-constrained setting remains challenging, especially in the presence of loose clothing such as dresses. In this paper, we propose a new adversarial generative model of realistic 3D people from 2D images. Our method captures shape and deformation of the body and loose clothing by adopting a holistic 3D generator and integrating an efficient and flexible articulation module. To improve realism, we train our model using multiple discriminators while also integrating geometric cues in the form of predicted 2D normal maps. We experimentally find that our method outperforms previous 3D- and articulation-aware methods in terms of geometry and appearance. We validate the effectiveness of our model and the importance of each component via systematic ablation studies.
Provisioning dynamic machine learning (ML) inference as a service for artificial intelligence (AI) applications on edge devices faces many challenges, including the trade-off among accuracy loss, carbon emission, and unknown future costs. Besides, many governments are launching carbon emission rights (CER) for operators to further reduce carbon emissions and thereby help reverse climate change. Facing these challenges, to achieve carbon-aware ML task offloading under limited carbon emission rights and thus green edge AI, we formulate a joint ML task offloading and CER purchasing problem that aims to minimize the accuracy loss under a long-term time-averaged cost budget for purchasing the required CER. However, considering the uncertainty of resource prices, CER purchasing prices, the carbon intensity of sites, and ML task arrivals, it is hard to decide the optimal policy online over a long-running period. To overcome this difficulty, we leverage the two-timescale Lyapunov optimization technique, whose $T$-slot drift-plus-penalty methodology inspires us to propose an online algorithm that purchases CER on multiple timescales (in advance in the carbon futures market and on demand in the carbon spot market) and decides where to offload ML tasks. Considering the NP-hardness of the $T$-slot problems, we further propose a resource-restricted randomized dependent rounding algorithm to obtain a near-optimal solution without any future information. Our theoretical analysis and extensive simulation results driven by real carbon intensity traces show the superior performance of the proposed algorithms.
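The drift-plus-penalty idea behind such Lyapunov-based online control can be shown in a minimal toy: a virtual queue tracks accumulated budget violation, and each slot the controller picks the action minimizing V*penalty + Q*cost, which drives the time-averaged cost toward the budget without any future price information. The two-action setting, price distribution, and parameter values below are illustrative assumptions, not the paper's formulation.

```python
import random

random.seed(0)

V, budget, T = 10.0, 1.0, 2000   # penalty weight, per-slot cost budget, horizon
Q = 0.0                          # virtual queue: accumulated budget violation
total_cost = 0.0
for t in range(T):
    price = random.uniform(0.5, 2.0)   # assumed (unknown-in-advance) CER price
    # Two toy actions per slot:
    #   offload: cost = price, accuracy loss = 0
    #   local:   cost = 0,     accuracy loss = 1
    # Drift-plus-penalty picks the action minimizing V*loss + Q*cost.
    offload = Q * price < V
    cost = price if offload else 0.0
    total_cost += cost
    Q = max(Q + cost - budget, 0.0)    # queue update keeps Q >= 0

avg_cost = total_cost / T
```

The standard Lyapunov argument bounds the average cost by budget + Q(T)/T, so the budget is met asymptotically while the queue stays bounded.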
The joint communication and sensing (JCS) system can provide higher spectrum efficiency and hardware savings for 6G machine-type communication (MTC) applications by merging the necessary communication and sensing abilities into unified spectrum and transceivers. In order to suppress the mutual interference between the communication and radar sensing signals, and thereby improve communication reliability and radar sensing accuracy, we propose a novel code-division orthogonal frequency division multiplexing (CD-OFDM) JCS MTC system, where MTC users can simultaneously and continuously conduct communication and sensing with each other. We propose a novel CD-OFDM JCS signal and a corresponding successive-interference-cancellation (SIC) based signal processing technique that obtains code-division multiplexing (CDM) gain and is compatible with the prevalent orthogonal frequency division multiplexing (OFDM) communication system. To model the unified JCS signal transmission and reception process, we propose a novel unified JCS channel model. Finally, simulation and numerical results verify the feasibility of the CD-OFDM JCS MTC system and its error propagation performance. We show that the CD-OFDM JCS MTC system can achieve not only more reliable communication but also comparably robust radar sensing compared with the precedent OFDM JCS system, especially in the low signal-to-interference-plus-noise ratio (SINR) regime.
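As a toy view of the code-division multiplexing that CD-OFDM builds on, two symbols spread with orthogonal Walsh codes can share the same time-frequency resources and still be separated exactly at the receiver; the SIC stage in the paper additionally cancels the stronger detected component before processing the weaker one. The codes and symbols below are illustrative, not the paper's waveform.

```python
import numpy as np

# Orthogonal Walsh (Hadamard) spreading codes of length 4: c1 . c2 = 0
c1 = np.array([1, 1, 1, 1])
c2 = np.array([1, -1, 1, -1])

s1, s2 = 1 + 0j, -1 + 0j        # toy symbols sharing the same resources
tx = s1 * c1 + s2 * c2          # code-division multiplexed chip sequence

# Despreading: correlate with each code and normalize by the code length.
# Orthogonality removes the other symbol's contribution entirely.
r1 = (tx @ c1) / len(c1)
r2 = (tx @ c2) / len(c2)
```

With non-ideal channels the cross-correlation is no longer exactly zero, which is where SIC-based processing earns its keep.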
We propose a novel cooperative joint sensing-communication (JSC) unmanned aerial vehicle (UAV) network that can achieve downward-looking detection and transmit detection data simultaneously, using the same time and frequency resources, by exploiting a beam sharing scheme. The UAV network consists of a UAV that works as a fusion center (FCUAV) and multiple subordinate UAVs (SUs). All UAVs fly at a fixed height. The FCUAV integrates the sensing data of the network and carries out downward-looking detection, while the SUs carry out downward-looking detection and transmit their sensing data to the FCUAV. To achieve the beam sharing scheme, each UAV is equipped with a novel JSC antenna array composed of a sensing subarray (SenA) and a communication subarray (ComA), which generate the sensing beam (SenB) and the communication beam (ComB) for detection and communication, respectively. The SenB and ComB of each UAV share a total amount of radio power. Because the communication and sensing directions are spatially orthogonal, SenB and ComB can easily be formed orthogonally. The upper bound of average cooperative sensing area (UB-ACSA) is defined as the metric for sensing performance, which is related to the mutual sensing interference and the communication capacity. Numerical simulations prove the validity of the theoretical expressions for the UB-ACSA of the network. The optimal number of UAVs and the optimal SenB power are identified under the total power constraint.
Sequential recommender systems train their models on large amounts of implicit user feedback data and may be subject to biases when users are systematically under- or over-exposed to certain items. Unbiased learning based on inverse propensity scores (IPS), which estimate the probability of observing a user-item pair given the historical information, has been proposed to address this issue. In these methods, propensity score estimation is usually limited to the item view, that is, the feedback data are treated as sequences of items that the users interacted with. However, the feedback data can also be treated from the user view, as sequences of users that interact with the items. Moreover, the two views can jointly enhance propensity score estimation. Inspired by this observation, we propose to estimate the propensity scores from both the user and item views, a method we call Dually Enhanced Propensity Score Estimation (DEPS). Specifically, given a target user-item pair and the corresponding item and user interaction sequences, DEPS first constructs a time-aware causal graph to represent the user-item observational probability. According to the graph, two complementary propensity scores are estimated from the item and user views, respectively, based on the same set of user feedback data. Finally, two transformers are designed to make the final preference prediction. Theoretical analysis shows the unbiasedness and variance of DEPS. Experimental results on three publicly available datasets and an industrial dataset demonstrate that DEPS significantly outperforms state-of-the-art baselines.
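The core IPS idea, reweighting each observed interaction by the inverse of its exposure probability so that the estimate is unbiased despite biased exposure, can be sketched with a two-item toy. The rewards and propensities below are made-up illustrative numbers; DEPS's contribution is estimating such propensities jointly from the item and user views rather than from the item view alone.

```python
import random

random.seed(0)

reward = {0: 1.0, 1: 0.0}   # toy true relevance per item
prop = {0: 0.9, 1: 0.1}     # exposure propensities: item 0 is over-exposed
n = 100_000

naive_sum, ips_sum = 0.0, 0.0
for _ in range(n):
    i = 0 if random.random() < 0.5 else 1    # items are equally relevant a priori
    if random.random() < prop[i]:            # ...but observed with unequal propensity
        naive_sum += reward[i]               # naive estimator ignores exposure bias
        ips_sum += reward[i] / prop[i]       # IPS reweights by inverse propensity

naive = naive_sum / n   # biased: over-counts the over-exposed relevant item's regime
ips = ips_sum / n       # unbiased estimate of E[reward] = 0.5
```

The naive average converges to 0.45 here (item 0's reward times its observation rate), while the IPS estimate converges to the true mean 0.5, at the price of higher variance when propensities are small.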