Shuang Yang

Tile Networks: Learning Optimal Geometric Layout for Whole-page Recommendation

Mar 03, 2023

Shuai Xiao, Zaifan Jiang, Shuang Yang

Abstract: Finding optimal configurations in a geometric space is a key challenge in many technological disciplines. Current approaches rely heavily on human domain expertise and are difficult to scale. In this paper we show it is possible to solve configuration optimization problems for whole-page recommendation using reinforcement learning. The proposed Tile Networks is a neural architecture that optimizes 2D geometric configurations by arranging items in proper positions. Empirical results on a real dataset demonstrate its superior performance compared to traditional learning-to-rank approaches and recent deep models.

* Published at Proceedings of the 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 2022

Model-based Constrained MDP for Budget Allocation in Sequential Incentive Marketing

Mar 02, 2023

Shuai Xiao, Le Guo, Zaifan Jiang, Lei Lv, Yuanbo Chen, Jun Zhu, Shuang Yang

Abstract: Sequential incentive marketing is an important approach for online businesses to acquire customers, increase loyalty and boost sales. How to effectively allocate the incentives so as to maximize the return (e.g., business objectives) under the budget constraint, however, is less studied in the literature. This problem is technically challenging because 1) the allocation strategy has to be learned using historically logged data, which is counterfactual in nature, and 2) both the optimality and the feasibility (i.e., that cost cannot exceed budget) of the strategy need to be assessed before it is deployed to online systems. In this paper, we formulate the problem as a constrained Markov decision process (CMDP). To solve the CMDP problem with logged counterfactual data, we propose an efficient learning algorithm which combines bisection search and model-based planning. First, the CMDP is converted into its dual using Lagrangian relaxation, which is proved to be monotonic with respect to the dual variable. Furthermore, we show that the dual problem can be solved by policy learning, with the optimal dual variable being found efficiently via bisection search (i.e., by taking advantage of the monotonicity). Lastly, we show that model-based planning can be used to effectively accelerate the joint optimization process without retraining the policy for every dual variable. Empirical results on synthetic and real marketing datasets confirm the effectiveness of our methods.

* Published at CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
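
The bisection idea in the abstract above can be illustrated on a much simpler problem than a CMDP. The following is a minimal sketch, not the paper's algorithm: it assumes a one-step allocation in which each user has a few hypothetical (reward, cost) options, relaxes the budget constraint with a multiplier `lam`, and uses the fact that total cost does not increase as `lam` grows. The names `best_response` and `bisect_dual`, the data, and the upper bound `lam_hi` are all illustrative assumptions.

```python
from typing import List, Tuple

Option = Tuple[float, float]  # (expected reward, cost); hypothetical toy data


def best_response(users: List[List[Option]], lam: float) -> Tuple[float, float]:
    """For a fixed multiplier, each user independently picks the option that
    maximizes reward - lam * cost (ties go to the cheaper option)."""
    total_reward, total_cost = 0.0, 0.0
    for options in users:
        reward, cost = max(options, key=lambda rc: (rc[0] - lam * rc[1], -rc[1]))
        total_reward += reward
        total_cost += cost
    return total_reward, total_cost


def bisect_dual(users: List[List[Option]], budget: float,
                lam_hi: float = 100.0, tol: float = 1e-6) -> float:
    """Find (approximately) the smallest multiplier whose best response fits
    the budget. Relies on total cost being non-increasing in lam."""
    if best_response(users, 0.0)[1] <= budget:
        return 0.0  # the unconstrained optimum is already feasible
    if best_response(users, lam_hi)[1] > budget:
        raise ValueError("lam_hi is too small to reach a feasible allocation")
    lam_lo = 0.0  # invariant: lam_lo is infeasible, lam_hi is feasible
    while lam_hi - lam_lo > tol:
        mid = 0.5 * (lam_lo + lam_hi)
        if best_response(users, mid)[1] > budget:
            lam_lo = mid
        else:
            lam_hi = mid
    return lam_hi


users = [
    [(0.0, 0.0), (1.0, 1.0), (3.0, 4.0)],
    [(0.0, 0.0), (2.0, 1.0), (2.5, 3.0)],
]
lam = bisect_dual(users, budget=3.0)
reward, cost = best_response(users, lam)
print(lam, reward, cost)
```

In the paper's setting the inner step would be policy learning on logged data rather than a per-user argmax; the sketch only shows why monotonicity in the dual variable makes bisection applicable.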

UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022

Jun 22, 2022

Yuanhang Zhang, Susan Liang, Shuang Yang, Shiguang Shan

Abstract: This report presents a brief description of our winning solution to the AVA Active Speaker Detection (ASD) task at ActivityNet Challenge 2022. Our underlying model, UniCon+, continues to build on our previous work, the Unified Context Network (UniCon) and Extended UniCon, which are designed for robust scene-level ASD. We augment the architecture with a simple GRU-based module that allows information about recurring identities to flow across scenes through read and update operations. We report a best result of 94.47% mAP on the AVA-ActiveSpeaker test set, which continues to rank first on this year's challenge leaderboard and significantly pushes the state of the art.

* 5 pages, 3 figures; technical report for AVA Challenge (see https://research.google.com/ava/challenge.html) at the International Challenge on Activity Recognition (ActivityNet), CVPR 2022

UniCon: Unified Context Network for Robust Active Speaker Detection

Aug 05, 2021

Yuanhang Zhang, Susan Liang, Shuang Yang, Xiao Liu, Zhongqin Wu, Shiguang Shan, Xilin Chen

Abstract: We introduce a new efficient framework, the Unified Context Network (UniCon), for robust active speaker detection (ASD). Traditional methods for ASD usually operate on each candidate's pre-cropped face track separately and do not sufficiently consider the relationships among the candidates. This potentially limits performance, especially in challenging scenarios with low-resolution faces, multiple candidates, etc. Our solution is a novel, unified framework that focuses on jointly modeling multiple types of contextual information: spatial context to indicate the position and scale of each candidate's face, relational context to capture the visual relationships among the candidates and contrast audio-visual affinities with each other, and temporal context to aggregate long-term information and smooth out local uncertainties. Based on such information, our model optimizes all candidates in a unified process for robust and reliable ASD. A thorough ablation study is performed on several challenging ASD benchmarks under different settings. In particular, our method outperforms the state-of-the-art by a large margin of about 15% mean Average Precision (mAP) absolute on two challenging subsets: one with three candidate speakers, and the other with faces smaller than 64 pixels. Together, our UniCon achieves 92.0% mAP on the AVA-ActiveSpeaker validation set, surpassing 90% for the first time on this challenging dataset at the time of submission. Project website: https://unicon-asd.github.io/.

* 10 pages, 6 figures; to appear at ACM Multimedia 2021

Progressive-Scale Boundary Blackbox Attack via Projective Gradient Estimation

Jun 10, 2021

Jiawei Zhang, Linyi Li, Huichen Li, Xiaolu Zhang, Shuang Yang, Bo Li

Abstract: Boundary-based blackbox attacks have been recognized as practical and effective, given that an attacker only needs access to the final model prediction. However, their query cost is generally high, especially for high-dimensional image data. In this paper, we show that query efficiency depends strongly on the scale at which the attack is applied, and that attacking at the optimal scale significantly improves the efficiency. In particular, we propose a theoretical framework to analyze and show three key characteristics to improve the query efficiency. We prove that there exists an optimal scale for projective gradient estimation. Our framework also explains the satisfactory performance achieved by existing boundary blackbox attacks. Based on our theoretical framework, we propose Progressive-Scale enabled projective Boundary Attack (PSBA) to improve the query efficiency via progressive scaling techniques. In particular, we employ Progressive-GAN to optimize the scale of projections, which we call PSBA-PGAN. We evaluate our approach on both spatial and frequency scales. Extensive experiments on MNIST, CIFAR-10, CelebA, and ImageNet against different models, including a real-world face recognition API, show that PSBA-PGAN significantly outperforms existing baseline attacks in terms of query efficiency and attack success rate. We also observe relatively stable optimal scales for different models and datasets. The code is publicly available at https://github.com/AI-secure/PSBA.

* ICML 2021

A Bi-Level Framework for Learning to Solve Combinatorial Optimization on Graphs

Jun 09, 2021

Runzhong Wang, Zhigang Hua, Gan Liu, Jiayi Zhang, Junchi Yan, Feng Qi, Shuang Yang, Jun Zhou, Xiaokang Yang

Abstract: Combinatorial Optimization (CO) has been a long-standing and challenging research topic, characterized by its NP-hard nature. Traditionally, such problems are approximately solved with heuristic algorithms, which are usually fast but may sacrifice solution quality. Currently, machine learning for combinatorial optimization (MLCO) has become a trending research topic, but most existing MLCO methods treat CO as a single-level optimization by directly learning end-to-end solutions, which are hard to scale up and mostly limited by the capacity of ML models given the high complexity of CO. In this paper, we propose a hybrid approach to combine the best of the two worlds, in which a bi-level framework is developed with an upper-level learning method to optimize the graph (e.g., add, delete or modify edges in a graph), fused with a lower-level heuristic algorithm that solves the problem on the optimized graph. Such a bi-level approach simplifies the learning on the original hard CO and can effectively mitigate the demand for model capacity. Experiments on several popular CO problems, such as Directed Acyclic Graph scheduling, Graph Edit Distance and the Hamiltonian Cycle Problem, show its effectiveness over manually designed heuristics and single-level learning methods.

Learning to Schedule DAG Tasks

Mar 05, 2021

Zhigang Hua, Feng Qi, Gan Liu, Shuang Yang

Abstract: Scheduling computational tasks represented by directed acyclic graphs (DAGs) is challenging because of its complexity. Conventional scheduling algorithms rely heavily on simple heuristics such as shortest job first (SJF) and critical path (CP), and are often lacking in scheduling quality. In this paper, we present a novel learning-based approach to scheduling DAG tasks. The algorithm employs a reinforcement learning agent to iteratively add directed edges to the DAG, one at a time, to enforce ordering (i.e., priorities of execution and resource allocation) of "tricky" job nodes. By doing so, the original DAG scheduling problem is dramatically reduced to a much simpler proxy problem, on which heuristic scheduling algorithms such as SJF and CP can be efficiently improved. Our approach can be easily applied to any existing heuristic scheduling algorithm. On the benchmark dataset of TPC-H, we show that our learning-based approach can significantly improve over popular heuristic algorithms and consistently achieve the best performance among several methods under a variety of settings.
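
To make the edge-adding idea above concrete, here is a small sketch of how one extra precedence edge can change what a shortest-job-first list scheduler does. It is an illustration under stated assumptions, not the paper's method: the four-task DAG, the two-worker setting, and the function name `sjf_makespan` are hypothetical, and the extra edge is chosen by hand where the paper uses a reinforcement learning agent.

```python
import heapq
from typing import Dict, List, Tuple


def sjf_makespan(durations: Dict[str, float],
                 edges: List[Tuple[str, str]],
                 workers: int) -> float:
    """Greedy list scheduling: whenever a worker is idle, start the ready
    task with the shortest duration. Returns the finish time of the last task."""
    indegree = {task: 0 for task in durations}
    children: Dict[str, List[str]] = {task: [] for task in durations}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    ready = [(durations[t], t) for t in durations if indegree[t] == 0]
    heapq.heapify(ready)
    running: List[Tuple[float, str]] = []  # min-heap of (finish time, task)
    now, finished = 0.0, 0

    while finished < len(durations):
        # Fill idle workers with the shortest ready tasks.
        while ready and len(running) < workers:
            duration, task = heapq.heappop(ready)
            heapq.heappush(running, (now + duration, task))
        if not running:
            raise ValueError("no runnable task: the graph contains a cycle")
        # Advance time to the next completion and release its children.
        now, task = heapq.heappop(running)
        finished += 1
        for child in children[task]:
            indegree[child] -= 1
            if indegree[child] == 0:
                heapq.heappush(ready, (durations[child], child))
    return now


durations = {"a": 1, "b": 1, "c": 2, "d": 2}
base_edges = [("c", "d")]                  # c must finish before d starts
plain = sjf_makespan(durations, base_edges, workers=2)
# Hand-picked extra edge: delay short task "a" until "c" is done, so that
# SJF starts the long chain c -> d immediately.
guided = sjf_makespan(durations, base_edges + [("c", "a")], workers=2)
print(plain, guided)
```

On this toy instance, plain SJF starts the two short tasks first and delays the c → d chain, while the added edge forces the chain to begin at time zero. The heuristic itself is unchanged; only the graph it runs on differs.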

Nonlinear Projection Based Gradient Estimation for Query Efficient Blackbox Attacks

Feb 25, 2021

Huichen Li, Linyi Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, Bo Li

Abstract: Gradient estimation and vector space projection have been studied as two distinct topics. We aim to bridge the gap between the two by investigating how to efficiently estimate gradients based on a projected low-dimensional space. We first provide lower and upper bounds for gradient estimation under both linear and nonlinear projections, and outline checkable sufficient conditions under which one is better than the other. Moreover, we analyze the query complexity for the projection-based gradient estimation and present a sufficient condition for query-efficient estimators. Building upon our theoretical analysis, we propose a novel query-efficient Nonlinear Gradient Projection-based Boundary Blackbox Attack (NonLinear-BA). We conduct extensive experiments on four image datasets: ImageNet, CelebA, CIFAR-10, and MNIST, and show the superiority of the proposed methods compared with the state-of-the-art baselines. In particular, we show that the projection-based boundary blackbox attacks are able to achieve perturbations of much smaller magnitude with 100% attack success rate based on efficient queries. Both linear and nonlinear projections demonstrate their advantages under different conditions. We also evaluate NonLinear-BA against the commercial online API MEGVII Face++, and demonstrate the high blackbox attack performance both quantitatively and qualitatively. The code is publicly available at https://github.com/AI-secure/NonLinear-BA.

* Accepted by AISTATS 2021; 9 pages excluding references and appendices

Learn an Effective Lip Reading Model without Pains

Nov 15, 2020

Dalu Feng, Shuang Yang, Shiguang Shan, Xilin Chen

Abstract: Lip reading, also known as visual speech recognition, aims to recognize speech content from videos by analyzing lip dynamics. There has been appealing progress in recent years, benefiting much from rapidly developing deep learning techniques and recent large-scale lip-reading datasets. Most existing methods obtain high performance by constructing a complex neural network together with several customized training strategies, which are usually described only briefly or even shown only in the source code. We find that making proper use of these strategies can consistently bring exciting improvements without changing much of the model. Considering the non-negligible effects of these strategies and the current difficulty of training an effective lip reading model, we perform, for the first time, a comprehensive quantitative study and comparative analysis to show the effects of several different choices for lip reading. By introducing only some easy-to-get refinements to the baseline pipeline, we obtain a clear improvement in performance from 83.7% to 88.4% and from 38.2% to 55.7% on the two largest publicly available lip reading datasets, LRW and LRW-1000, respectively. These results are comparable to, and even surpass, the existing state-of-the-art results.

Learning (Re-)Starting Solutions for Vehicle Routing Problems

Aug 08, 2020

Xingwen Zhang, Shuang Yang

Abstract: A key challenge in solving a combinatorial optimization problem is how to guide the agent (i.e., solver) to efficiently explore the enormous search space. Conventional approaches often rely on enumeration (e.g., exhaustive, random, or tabu search) or have to restrict the exploration to rather limited regions (e.g., a single path as in iterative algorithms). In this paper, we show it is possible to use machine learning to speed up the exploration. In particular, a value network is trained to evaluate solution candidates, which provides a useful structure (i.e., an approximate value surface) over the search space; this value network is then used to screen solutions to help a black-box optimization agent initialize or restart so as to navigate through the search space towards desirable solutions. Experiments demonstrate that the proposed "Learn to Restart" algorithm achieves promising results in solving Capacitated Vehicle Routing Problems (CVRPs).
