Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Rui Zhang

Henry

PMG : Personalized Multimodal Generation with Large Language Models

Apr 07, 2024

Xiaoteng Shen, Rui Zhang, Xiaoyan Zhao, Jieming Zhu, Xi Xiao

Figure 1 for PMG : Personalized Multimodal Generation with Large Language Models

Figure 2 for PMG : Personalized Multimodal Generation with Large Language Models

Figure 3 for PMG : Personalized Multimodal Generation with Large Language Models

Figure 4 for PMG : Personalized Multimodal Generation with Large Language Models

Abstract:The emergence of large language models (LLMs) has revolutionized the capabilities of text comprehension and generation. Multi-modal generation attracts great attention from both the industry and academia, but there is little work on personalized generation, which has important applications such as recommender systems. This paper proposes the first method for personalized multimodal generation using LLMs, showcases its applications and validates its performance via an extensive experimental study on two datasets. The proposed method, Personalized Multimodal Generation (PMG for short) first converts user behaviors (e.g., clicks in recommender systems or conversations with a virtual assistant) into natural language to facilitate LLM understanding and extract user preference descriptions. Such user preferences are then fed into a generator, such as a multimodal LLM or diffusion model, to produce personalized content. To capture user preferences comprehensively and accurately, we propose to let the LLM output a combination of explicit keywords and implicit embeddings to represent user preferences. Then the combination of keywords and embeddings are used as prompts to condition the generator. We optimize a weighted sum of the accuracy and preference scores so that the generated content has a good balance between them. Compared to a baseline method without personalization, PMG has a significant improvement on personalization for up to 8% in terms of LPIPS while retaining the accuracy of generation.

Via

Access Paper or Ask Questions

RAT: Retrieval-Augmented Transformer for Click-Through Rate Prediction

Apr 05, 2024

Yushen Li, Jinpeng Wang, Tao Dai, Jieming Zhu, Jun Yuan, Rui Zhang, Shu-Tao Xia

Abstract:Predicting click-through rates (CTR) is a fundamental task for Web applications, where a key issue is to devise effective models for feature interactions. Current methodologies predominantly concentrate on modeling feature interactions within an individual sample, while overlooking the potential cross-sample relationships that can serve as a reference context to enhance the prediction. To make up for such deficiency, this paper develops a Retrieval-Augmented Transformer (RAT), aiming to acquire fine-grained feature interactions within and across samples. By retrieving similar samples, we construct augmented input for each target sample. We then build Transformer layers with cascaded attention to capture both intra- and cross-sample feature interactions, facilitating comprehensive reasoning for improved CTR prediction while retaining efficiency. Extensive experiments on real-world datasets substantiate the effectiveness of RAT and suggest its advantage in long-tail scenarios. The code has been open-sourced at \url{https://github.com/YushenLi807/WWW24-RAT}.

* Accepted to The ACM Web Conference 2024 (WWW'24, short paper). Data and code are available

Via

Access Paper or Ask Questions

Evaluating LLMs at Detecting Errors in LLM Responses

Apr 04, 2024

Ryo Kamoi, Sarkar Snigdha Sarathi Das, Renze Lou, Jihyun Janice Ahn, Yilun Zhao, Xiaoxin Lu, Nan Zhang, Yusen Zhang, Ranran Haoran Zhang, Sujeeth Reddy Vummanthala(+5 more)

Figure 1 for Evaluating LLMs at Detecting Errors in LLM Responses

Figure 2 for Evaluating LLMs at Detecting Errors in LLM Responses

Figure 3 for Evaluating LLMs at Detecting Errors in LLM Responses

Figure 4 for Evaluating LLMs at Detecting Errors in LLM Responses

Abstract:With Large Language Models (LLMs) being widely used across various tasks, detecting errors in their responses is increasingly crucial. However, little research has been conducted on error detection of LLM responses. Collecting error annotations on LLM responses is challenging due to the subjective nature of many NLP tasks, and thus previous research focuses on tasks of little practical value (e.g., word sorting) or limited error types (e.g., faithfulness in summarization). This work introduces ReaLMistake, the first error detection benchmark consisting of objective, realistic, and diverse errors made by LLMs. ReaLMistake contains three challenging and meaningful tasks that introduce objectively assessable errors in four categories (reasoning correctness, instruction-following, context-faithfulness, and parameterized knowledge), eliciting naturally observed and diverse errors in responses of GPT-4 and Llama 2 70B annotated by experts. We use ReaLMistake to evaluate error detectors based on 12 LLMs. Our findings show: 1) Top LLMs like GPT-4 and Claude 3 detect errors made by LLMs at very low recall, and all LLM-based error detectors perform much worse than humans. 2) Explanations by LLM-based error detectors lack reliability. 3) LLMs-based error detection is sensitive to small changes in prompts but remains challenging to improve. 4) Popular approaches to improving LLMs, including self-consistency and majority vote, do not improve the error detection performance. Our benchmark and code are provided at https://github.com/psunlpgroup/ReaLMistake.

* Benchmark and code: https://github.com/psunlpgroup/ReaLMistake

Via

Access Paper or Ask Questions

Movable Antenna-Aided Hybrid Beamforming for Multi-User Communications

Apr 01, 2024

Yichi Zhang, Yuchen Zhang, Lipeng Zhu, Sa Xiao, Wanbin Tang, Yonina C. Eldar, Rui Zhang

Abstract:In this correspondence, we propose a movable antenna (MA)-aided multi-user hybrid beamforming scheme with a sub-connected structure, where multiple movable sub-arrays can independently change their positions within different local regions. To maximize the system sum rate, we jointly optimize the digital beamformer, analog beamformer, and positions of subarrays, under the constraints of unit modulus, finite movable regions, and power budget. Due to the non-concave/non-convex objective function/constraints, as well as the highly coupled variables, the formulated problem is challenging to solve. By employing fractional programming, we develop an alternating optimization framework to solve the problem via a combination of Lagrange multipliers, penalty method, and gradient descent. Numerical results reveal that the proposed MA-aided hybrid beamforming scheme significantly improves the sum rate compared to its fixed-position antenna (FPA) counterpart. Moreover, with sufficiently large movable regions, the proposed scheme with sub-connected MA arrays even outperforms the fully-connected FPA array.

Via

Access Paper or Ask Questions

Multimodal Pretraining, Adaptation, and Generation for Recommendation: A Survey

Mar 31, 2024

Qijiong Liu, Jieming Zhu, Yanting Yang, Quanyu Dai, Zhaocheng Du, Xiao-Ming Wu, Zhou Zhao, Rui Zhang, Zhenhua Dong

Figure 1 for Multimodal Pretraining, Adaptation, and Generation for Recommendation: A Survey

Abstract:Personalized recommendation serves as a ubiquitous channel for users to discover information or items tailored to their interests. However, traditional recommendation models primarily rely on unique IDs and categorical features for user-item matching, potentially overlooking the nuanced essence of raw item contents across multiple modalities such as text, image, audio, and video. This underutilization of multimodal data poses a limitation to recommender systems, especially in multimedia services like news, music, and short-video platforms. The recent advancements in pretrained multimodal models offer new opportunities and challenges in developing content-aware recommender systems. This survey seeks to provide a comprehensive exploration of the latest advancements and future trajectories in multimodal pretraining, adaptation, and generation techniques, as well as their applications to recommender systems. Furthermore, we discuss open challenges and opportunities for future research in this domain. We hope that this survey, along with our tutorial materials, will inspire further research efforts to advance this evolving landscape.

* Draft version submitted to KDD'2024 tutorials

Via

Access Paper or Ask Questions

Movable-Antenna Position Optimization: A Graph-based Approach

Mar 25, 2024

Weidong Mei, Xin Wei, Boyu Ning, Zhi Chen, Rui Zhang

Figure 1 for Movable-Antenna Position Optimization: A Graph-based Approach

Figure 2 for Movable-Antenna Position Optimization: A Graph-based Approach

Figure 3 for Movable-Antenna Position Optimization: A Graph-based Approach

Figure 4 for Movable-Antenna Position Optimization: A Graph-based Approach

Abstract:Fluid antennas (FAs) and movable antennas (MAs) have emerged as promising technologies in wireless communications, which offer the flexibility to improve channel conditions by adjusting transmit/receive antenna positions within a spatial region. In this letter, we focus on an MA-enhanced multiple-input single-output (MISO) communication system, aiming to optimize the positions of multiple transmit MAs to maximize the received signal power. Unlike the prior works on continuously searching for the optimal MA positions, we propose to sample the transmit region into discrete points, such that the continuous antenna position optimization problem is transformed to a discrete sampling point selection problem based on the point-wise channel information. However, such a point selection problem is combinatory and challenging to be optimally solved. To tackle this challenge, we ingeniously recast it as an equivalent fixed-hop shortest path problem in graph theory and propose a customized algorithm to solve it optimally in polynomial time. To further reduce the complexity, a linear-time sequential update algorithm is also proposed to obtain a high-quality suboptimal solution. Numerical results demonstrate that the proposed algorithms can yield considerable performance gains over the conventional fixed-position antennas with/without antenna selection.

* 5 pages, 6 figures. We propose a graph-based algorithm that is able to optimally solve the fluid-/movable-antenna position optimization problem in polynomial time

Via

Access Paper or Ask Questions

6D Movable Antenna Enhanced Wireless Network Via Discrete Position and Rotation Optimization

Mar 25, 2024

Xiaodan Shao, Rui Zhang, Qijun Jiang, Robert Schober

Figure 1 for 6D Movable Antenna Enhanced Wireless Network Via Discrete Position and Rotation Optimization

Figure 2 for 6D Movable Antenna Enhanced Wireless Network Via Discrete Position and Rotation Optimization

Figure 3 for 6D Movable Antenna Enhanced Wireless Network Via Discrete Position and Rotation Optimization

Figure 4 for 6D Movable Antenna Enhanced Wireless Network Via Discrete Position and Rotation Optimization

Abstract:Six-dimensional movable antenna (6DMA) is an effective approach to improve wireless network capacity by adjusting the 3D positions and 3D rotations of distributed antenna surfaces based on the users' spatial distribution and statistical channel information. Although continuously positioning/rotating 6DMA surfaces can achieve the greatest flexibility and thus the highest capacity improvement, it is difficult to implement due to the discrete movement constraints of practical stepper motors. Thus, in this paper, we consider a 6DMA-aided base station (BS) with only a finite number of possible discrete positions and rotations for the 6DMA surfaces. We aim to maximize the average network capacity for random numbers of users at random locations by jointly optimizing the 3D positions and 3D rotations of multiple 6DMA surfaces at the BS subject to discrete movement constraints. In particular, we consider the practical cases with and without statistical channel knowledge of the users, and propose corresponding offline and online optimization algorithms, by leveraging the Monte Carlo and conditional sample mean (CSM) methods, respectively. Simulation results verify the effectiveness of our proposed offline and online algorithms for discrete position/rotation optimization of 6DMA surfaces as compared to various benchmark schemes with fixed-position antennas (FPAs) and 6DMAs with limited movability. It is shown that 6DMA-BS can significantly enhance wireless network capacity, even under discrete position/rotation constraints, by exploiting the spatial distribution characteristics of the users.

* 13 pages, double column

Via

Access Paper or Ask Questions

A Survey on Self-Supervised Pre-Training of Graph Foundation Models: A Knowledge-Based Perspective

Mar 24, 2024

Ziwen Zhao, Yuhua Li, Yixiong Zou, Ruixuan Li, Rui Zhang

Abstract:Graph self-supervised learning is now a go-to method for pre-training graph foundation models, including graph neural networks, graph transformers, and more recent large language model (LLM)-based graph models. There is a wide variety of knowledge patterns embedded in the structure and properties of graphs which may be used for pre-training, but we lack a systematic overview of self-supervised pre-training tasks from the perspective of graph knowledge. In this paper, we comprehensively survey and analyze the pre-training tasks of graph foundation models from a knowledge-based perspective, consisting of microscopic (nodes, links, etc) and macroscopic knowledge (clusters, global structure, etc). It covers a total of 9 knowledge categories and 25 pre-training tasks, as well as various downstream task adaptation strategies. Furthermore, an extensive list of the related papers with detailed metadata is provided at https://github.com/Newiz430/Pretext.

* Work in progress

Via

Access Paper or Ask Questions

Energy Efficient Design of Active STAR-RIS-Aided SWIPT Systems

Mar 23, 2024

Sajad Faramarzi, Hosein Zarini, Sepideh Javadi, Mohammad Robat Mili, Rui Zhang, George K. Karagiannidis, Naofal Al-Dhahir

Abstract:In this paper, we consider the downlink transmission of a multi-antenna base station (BS) supported by an active simultaneously transmitting and reconfigurable intelligent surface (STAR-RIS) to serve single-antenna users via simultaneous wireless information and power transfer (SWIPT). In this context, we formulate an energy efficiency maximisation problem that jointly optimises the gain, element selection and phase shift matrices of the active STAR-RIS, the transmit beamforming of the BS and the power splitting ratio of the users. With respect to the highly coupled and non-convex form of this problem, an alternating optimisation solution approach is proposed, using tools from convex optimisation and reinforcement learning. Specifically, semi-definite relaxation (SDR), difference of concave functions (DC), and fractional programming techniques are employed to transform the non-convex optimisation problem into a convex form for optimising the BS beamforming vector and the power splitting ratio of the SWIPT. Then, by integrating meta-learning with the modified deep deterministic policy gradient (DDPG) and soft actor-critical (SAC) methods, a combinatorial reinforcement learning network is developed to optimise the element selection, gain and phase shift matrices of the active STAR-RIS. Our simulations show the effectiveness of the proposed resource allocation scheme. Furthermore, our proposed active STAR-RIS-based SWIPT system outperforms its passive counterpart by 57% on average.

Via

Access Paper or Ask Questions

6D Movable Antenna Based on User Distribution: Modeling and Optimization

Mar 17, 2024

Xiaodan Shao, Qijun Jiang, Rui Zhang

Figure 1 for 6D Movable Antenna Based on User Distribution: Modeling and Optimization

Figure 2 for 6D Movable Antenna Based on User Distribution: Modeling and Optimization

Figure 3 for 6D Movable Antenna Based on User Distribution: Modeling and Optimization

Figure 4 for 6D Movable Antenna Based on User Distribution: Modeling and Optimization

Abstract:In this paper, we propose a new six-dimensional (6D) movable antenna (6DMA) system for future wireless networks to improve the communication performance. Unlike the traditional fixed-position antenna (FPA) and existing fluid antenna/two-dimensional (2D) movable antenna (FA/2DMA) systems that adjust the positions of antennas only, the proposed 6DMA system consists of distributed antenna surfaces with independently adjustable three-dimensional (3D) positions as well as 3D rotations within a given space. In particular, this paper applies the 6DMA to the base station (BS) in wireless networks to provide full degrees of freedom (DoFs) for the BS to adapt to the dynamic user spatial distribution in the network. However, a challenging new problem arises on how to optimally control the 6D positions and rotations of all 6DMA surfaces at the BS to maximize the network capacity based on the user spatial distribution, subject to the practical constraints on 6D antennas' movement. To tackle this problem, we first model the 6DMA-enabled BS and the user channels with the BS in terms of 6D positions and rotations of all 6DMA surfaces. Next, we propose an efficient alternating optimization algorithm to search for the best 6D positions and rotations of all 6DMA surfaces by leveraging the Monte Carlo simulation technique. Specifically, we sequentially optimize the 3D position/3D rotation of each 6DMA surface with those of the other surfaces fixed in an iterative manner. Numerical results show that our proposed 6DMA-BS can significantly improve the network capacity as compared to the benchmark BS architectures with FPAs or 6DMAs with limited/partial movability, especially when the user distribution is more spatially non-uniform.

* Double column, 14 pages

Via

Access Paper or Ask Questions