Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yao Wang

Cache-Aware Reinforcement Learning in Large-Scale Recommender Systems

Apr 23, 2024

Xiaoshuang Chen, Gengrui Zhang, Yao Wang, Yulin Wu, Shuo Su, Kaiqiao Zhan, Ben Wang

Abstract:Modern large-scale recommender systems are built upon computation-intensive infrastructure and usually suffer from a huge difference in traffic between peak and off-peak periods. In peak periods, it is challenging to perform real-time computation for each request due to the limited budget of computational resources. The recommendation with a cache is a solution to this problem, where a user-wise result cache is used to provide recommendations when the recommender system cannot afford a real-time computation. However, the cached recommendations are usually suboptimal compared to real-time computation, and it is challenging to determine the items in the cache for each user. In this paper, we provide a cache-aware reinforcement learning (CARL) method to jointly optimize the recommendation by real-time computation and by the cache. We formulate the problem as a Markov decision process with user states and a cache state, where the cache state represents whether the recommender system performs recommendations by real-time computation or by the cache. The computational load of the recommender system determines the cache state. We perform reinforcement learning based on such a model to improve user engagement over multiple requests. Moreover, we show that the cache will introduce a challenge called critic dependency, which deteriorates the performance of reinforcement learning. To tackle this challenge, we propose an eigenfunction learning (EL) method to learn independent critics for CARL. Experiments show that CARL can significantly improve the users' engagement when considering the result cache. CARL has been fully launched in Kwai app, serving over 100 million users.

* 8 pages, 8 figures

Via

Access Paper or Ask Questions

Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

Apr 17, 2024

Long Cao, Liwei Ge, Daochi Zhang, Xiang Li, Yao Wang, Rui-Xue Xu, YiJing Yan, Xiao Zheng

Figure 1 for Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

Figure 2 for Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

Figure 3 for Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

Figure 4 for Neural Network Approach for Non-Markovian Dissipative Dynamics of Many-Body Open Quantum Systems

Abstract:Simulating the dynamics of open quantum systems coupled to non-Markovian environments remains an outstanding challenge due to exponentially scaling computational costs. We present an artificial intelligence strategy to overcome this obstacle by integrating the neural quantum states approach into the dissipaton-embedded quantum master equation in second quantization (DQME-SQ). Our approach utilizes restricted Boltzmann machines (RBMs) to compactly represent the reduced density tensor, explicitly encoding the combined effects of system-environment correlations and nonMarkovian memory. Applied to model systems exhibiting prominent effects of system-environment correlation and non-Markovian memory, our approach achieves comparable accuracy to conventional hierarchical equations of motion, while requiring significantly fewer dynamical variables. The novel RBM-based DQME-SQ approach paves the way for investigating non-Markovian open quantum dynamics in previously intractable regimes, with implications spanning various frontiers of modern science.

* 7 pages, 5 figures

Via

Access Paper or Ask Questions

One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing

Apr 15, 2024

Yueyu Hu, Onur G. Guleryuz, Philip A. Chou, Danhang Tang, Jonathan Taylor, Rus Maxham, Yao Wang

Figure 1 for One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing

Figure 2 for One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing

Figure 3 for One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing

Figure 4 for One-Click Upgrade from 2D to 3D: Sandwiched RGB-D Video Compression for Stereoscopic Teleconferencing

Abstract:Stereoscopic video conferencing is still challenging due to the need to compress stereo RGB-D video in real-time. Though hardware implementations of standard video codecs such as H.264 / AVC and HEVC are widely available, they are not designed for stereoscopic videos and suffer from reduced quality and performance. Specific multiview or 3D extensions of these codecs are complex and lack efficient implementations. In this paper, we propose a new approach to upgrade a 2D video codec to support stereo RGB-D video compression, by wrapping it with a neural pre- and post-processor pair. The neural networks are end-to-end trained with an image codec proxy, and shown to work with a more sophisticated video codec. We also propose a geometry-aware loss function to improve rendering quality. We train the neural pre- and post-processors on a synthetic 4D people dataset, and evaluate it on both synthetic and real-captured stereo RGB-D videos. Experimental results show that the neural networks generalize well to unseen data and work out-of-box with various video codecs. Our approach saves about 30% bit-rate compared to a conventional video coding scheme and MV-HEVC at the same level of rendering quality from a novel view, without the need of a task-specific hardware upgrade.

* Accepted by CVPR 2024 Workshop (AIS: Vision, Graphics and AI for Streaming https://ai4streaming-workshop.github.io )

Via

Access Paper or Ask Questions

Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

Apr 12, 2024

Xianhua He, Dashuang Liang, Song Yang, Zhanlong Hao, Hui Ma, Binjie Mao, Xi Li, Yao Wang, Pengfei Yan, Ajian Liu

Figure 1 for Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

Figure 2 for Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

Figure 3 for Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

Figure 4 for Joint Physical-Digital Facial Attack Detection Via Simulating Spoofing Clues

Abstract:Face recognition systems are frequently subjected to a variety of physical and digital attacks of different types. Previous methods have achieved satisfactory performance in scenarios that address physical attacks and digital attacks, respectively. However, few methods are considered to integrate a model that simultaneously addresses both physical and digital attacks, implying the necessity to develop and maintain multiple models. To jointly detect physical and digital attacks within a single model, we propose an innovative approach that can adapt to any network architecture. Our approach mainly contains two types of data augmentation, which we call Simulated Physical Spoofing Clues augmentation (SPSC) and Simulated Digital Spoofing Clues augmentation (SDSC). SPSC and SDSC augment live samples into simulated attack samples by simulating spoofing clues of physical and digital attacks, respectively, which significantly improve the capability of the model to detect "unseen" attack types. Extensive experiments show that SPSC and SDSC can achieve state-of-the-art generalization in Protocols 2.1 and 2.2 of the UniAttackData dataset, respectively. Our method won first place in "Unified Physical-Digital Face Attack Detection" of the 5th Face Anti-spoofing Challenge@CVPR2024. Our final submission obtains 3.75% APCER, 0.93% BPCER, and 2.34% ACER, respectively. Our code is available at https://github.com/Xianhua-He/cvpr2024-face-anti-spoofing-challenge.

* 10 pages with 6 figures, Accepted by CVPRW 2024

Via

Access Paper or Ask Questions

DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images

Mar 26, 2024

Chuhan Jiao, Yao Wang, Guanhua Zhang, Mihai Bâce, Zhiming Hu, Andreas Bulling

Figure 1 for DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images

Figure 2 for DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images

Figure 3 for DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images

Figure 4 for DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images

Abstract:We present DiffGaze, a novel method for generating realistic and diverse continuous human gaze sequences on 360{\deg} images based on a conditional score-based denoising diffusion model. Generating human gaze on 360{\deg} images is important for various human-computer interaction and computer graphics applications, e.g. for creating large-scale eye tracking datasets or for realistic animation of virtual humans. However, existing methods are limited to predicting discrete fixation sequences or aggregated saliency maps, thereby neglecting crucial parts of natural gaze behaviour. Our method uses features extracted from 360{\deg} images as condition and uses two transformers to model the temporal and spatial dependencies of continuous human gaze. We evaluate DiffGaze on two 360{\deg} image benchmarks for gaze sequence generation as well as scanpath prediction and saliency prediction. Our evaluations show that DiffGaze outperforms state-of-the-art methods on all tasks on both benchmarks. We also report a 21-participant user study showing that our method generates gaze sequences that are indistinguishable from real human sequences.

Via

Access Paper or Ask Questions

Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction

Mar 17, 2024

Yixiang Mao, Liyang Sun, Yong Liu, Yao Wang

$Figure 1 for Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction$

$Figure 2 for Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction$

$Figure 3 for Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction$

$Figure 4 for Interactive $360^{\circ}$ Video Streaming Using FoV-Adaptive Coding with Temporal Prediction$

Abstract:For $360^{\circ}$ video streaming, FoV-adaptive coding that allocates more bits for the predicted user's field of view (FoV) is an effective way to maximize the rendered video quality under the limited bandwidth. We develop a low-latency FoV-adaptive coding and streaming system for interactive applications that is robust to bandwidth variations and FoV prediction errors. To minimize the end-to-end delay and yet maximize the coding efficiency, we propose a frame-level FoV-adaptive inter-coding structure. In each frame, regions that are in or near the predicted FoV are coded using temporal and spatial prediction, while a small rotating region is coded with spatial prediction only. This rotating intra region periodically refreshes the entire frame, thereby providing robustness to both FoV prediction errors and frame losses due to transmission errors. The system adapts the sizes and rates of different regions for each video segment to maximize the rendered video quality under the predicted bandwidth constraint. Integrating such frame-level FoV adaptation with temporal prediction is challenging due to the temporal variations of the FoV. We propose novel ways for modeling the influence of FoV dynamics on the quality-rate performance of temporal predictive coding.We further develop LSTM-based machine learning models to predict the user's FoV and network bandwidth.The proposed system is compared with three benchmark systems, using real-world network bandwidth traces and FoV traces, and is shown to significantly improve the rendered video quality, while achieving very low end-to-end delay and low frame-freeze probability.

Via

Access Paper or Ask Questions

DeepEMC-T2 Mapping: Deep Learning-Enabled T2 Mapping Based on Echo Modulation Curve Modeling

Feb 29, 2024

Haoyang Pei, Timothy M. Shepherd, Yao Wang, Fang Liu, Daniel K Sodickson, Noam Ben-Eliezer, Li Feng

Abstract:Purpose: Echo modulation curve (EMC) modeling can provide accurate and reproducible quantification of T2 relaxation times. The standard EMC-T2 mapping framework, however, requires sufficient echoes and cumbersome pixel-wise dictionary-matching steps. This work proposes a deep learning version of EMC-T2 mapping, called DeepEMC-T2 mapping, to efficiently estimate accurate T2 maps from fewer echoes without a dictionary. Methods: DeepEMC-T2 mapping was developed using a modified U-Net to estimate both T2 and Proton Density (PD) maps directly from multi-echo spin-echo (MESE) images. The modified U-Net employs several new features to improve the accuracy of T2/PD estimation. MESE datasets from 68 subjects were used for training and evaluation of the DeepEMC-T2 mapping technique. Multiple experiments were conducted to evaluate the impact of the proposed new features on DeepEMC-T2 mapping. Results: DeepEMC-T2 mapping achieved T2 estimation errors ranging from 3%-12% in different T2 ranges and 0.8%-1.7% for PD estimation with 10/7/5/3 echoes, which yielded more accurate parameter estimation than standard EMC-T2 mapping. The new features proposed in DeepEMC-T2 mapping enabled improved parameter estimation. The use of a larger echo spacing with fewer echoes can maintain the accuracy of T2 and PD estimations while reducing the number of 180-degree refocusing pulses. Conclusions: DeepEMC-T2 mapping enables simplified, efficient, and accurate T2 quantification directly from MESE images without a time-consuming dictionary-matching step and requires fewer echoes. This allows for increased volumetric coverage and/or decreased SAR by reducing the number of 180-degree refocusing pulses.

Via

Access Paper or Ask Questions

Low-Tubal-Rank Tensor Recovery via Factorized Gradient Descent

Feb 03, 2024

Zhiyu Liu, Zhi Han, Yandong Tang, Xi-Le Zhao, Yao Wang

Figure 1 for Low-Tubal-Rank Tensor Recovery via Factorized Gradient Descent

Figure 2 for Low-Tubal-Rank Tensor Recovery via Factorized Gradient Descent

Figure 3 for Low-Tubal-Rank Tensor Recovery via Factorized Gradient Descent

Figure 4 for Low-Tubal-Rank Tensor Recovery via Factorized Gradient Descent

Abstract:This paper considers the problem of recovering a tensor with an underlying low-tubal-rank structure from a small number of corrupted linear measurements. Traditional approaches tackling such a problem require the computation of tensor Singular Value Decomposition (t-SVD), that is a computationally intensive process, rendering them impractical for dealing with large-scale tensors. Aim to address this challenge, we propose an efficient and effective low-tubal-rank tensor recovery method based on a factorization procedure akin to the Burer-Monteiro (BM) method. Precisely, our fundamental approach involves decomposing a large tensor into two smaller factor tensors, followed by solving the problem through factorized gradient descent (FGD). This strategy eliminates the need for t-SVD computation, thereby reducing computational costs and storage requirements. We provide rigorous theoretical analysis to ensure the convergence of FGD under both noise-free and noisy situations. Additionally, it is worth noting that our method does not require the precise estimation of the tensor tubal-rank. Even in cases where the tubal-rank is slightly overestimated, our approach continues to demonstrate robust performance. A series of experiments have been carried out to demonstrate that, as compared to other popular ones, our approach exhibits superior performance in multiple scenarios, in terms of the faster computational speed and the smaller convergence error.

* 13 pages, 4 figures

Via

Access Paper or Ask Questions

UNEX-RL: Reinforcing Long-Term Rewards in Multi-Stage Recommender Systems with UNidirectional EXecution

Jan 12, 2024

Gengrui Zhang, Yao Wang, Xiaoshuang Chen, Hongyi Qian, Kaiqiao Zhan, Ben Wang

Abstract:In recent years, there has been a growing interest in utilizing reinforcement learning (RL) to optimize long-term rewards in recommender systems. Since industrial recommender systems are typically designed as multi-stage systems, RL methods with a single agent face challenges when optimizing multiple stages simultaneously. The reason is that different stages have different observation spaces, and thus cannot be modeled by a single agent. To address this issue, we propose a novel UNidirectional-EXecution-based multi-agent Reinforcement Learning (UNEX-RL) framework to reinforce the long-term rewards in multi-stage recommender systems. We show that the unidirectional execution is a key feature of multi-stage recommender systems, bringing new challenges to the applications of multi-agent reinforcement learning (MARL), namely the observation dependency and the cascading effect. To tackle these challenges, we provide a cascading information chain (CIC) method to separate the independent observations from action-dependent observations and use CIC to train UNEX-RL effectively. We also discuss practical variance reduction techniques for UNEX-RL. Finally, we show the effectiveness of UNEX-RL on both public datasets and an online recommender system with over 100 million users. Specifically, UNEX-RL reveals a 0.558% increase in users' usage time compared with single-agent RL algorithms in online A/B experiments, highlighting the effectiveness of UNEX-RL in industrial recommender systems.

* Accepted by AAAI2024

Via

Access Paper or Ask Questions

Efficient Generalized Low-Rank Tensor Contextual Bandits

Nov 09, 2023

Qianxin Yi, Yiyang Yang, Shaojie Tang, Yao Wang

Figure 1 for Efficient Generalized Low-Rank Tensor Contextual Bandits

Figure 2 for Efficient Generalized Low-Rank Tensor Contextual Bandits

Figure 3 for Efficient Generalized Low-Rank Tensor Contextual Bandits

Figure 4 for Efficient Generalized Low-Rank Tensor Contextual Bandits

Abstract:In this paper, we aim to build a novel bandits algorithm that is capable of fully harnessing the power of multi-dimensional data and the inherent non-linearity of reward functions to provide high-usable and accountable decision-making services. To this end, we introduce a generalized low-rank tensor contextual bandits model in which an action is formed from three feature vectors, and thus can be represented by a tensor. In this formulation, the reward is determined through a generalized linear function applied to the inner product of the action's feature tensor and a fixed but unknown parameter tensor with a low tubal rank. To effectively achieve the trade-off between exploration and exploitation, we introduce a novel algorithm called "Generalized Low-Rank Tensor Exploration Subspace then Refine" (G-LowTESTR). This algorithm first collects raw data to explore the intrinsic low-rank tensor subspace information embedded in the decision-making scenario, and then converts the original problem into an almost lower-dimensional generalized linear contextual bandits problem. Rigorous theoretical analysis shows that the regret bound of G-LowTESTR is superior to those in vectorization and matricization cases. We conduct a series of simulations and real data experiments to further highlight the effectiveness of G-LowTESTR, leveraging its ability to capitalize on the low-rank tensor structure for enhanced learning.

Via

Access Paper or Ask Questions