Alert button
Picture for Shang Liu

Shang Liu

Alert button

UnifiedSSR: A Unified Framework of Sequential Search and Recommendation

Oct 21, 2023
Jiayi Xie, Shang Liu, Gao Cong, Zhenzhong Chen

In this work, we propose a Unified framework of Sequential Search and Recommendation (UnifiedSSR) for joint learning of user behavior history in both search and recommendation scenarios. Specifically, we consider user-interacted products in the recommendation scenario, user-interacted products and user-issued queries in the search scenario as three distinct types of user behaviors. We propose a dual-branch network to encode the pair of interacted product history and issued query history in the search scenario in parallel. This allows for cross-scenario modeling by deactivating the query branch for the recommendation scenario. Through the parameter sharing between dual branches, as well as between product branches in two scenarios, we incorporate cross-view and cross-scenario associations of user behaviors, providing a comprehensive understanding of user behavior patterns. To further enhance user behavior modeling by capturing the underlying dynamic intent, an Intent-oriented Session Modeling module is designed for inferring intent-oriented semantic sessions from the contextual information in behavior sequences. In particular, we consider self-supervised learning signals from two perspectives for intent-oriented semantic session locating, which encourage session discrimination within each behavior sequence and session alignment between dual behavior sequences. Extensive experiments on three public datasets demonstrate that UnifiedSSR consistently outperforms state-of-the-art methods for both search and recommendation.

Viaarxiv icon

RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model

Aug 10, 2023
Yao Lu, Shang Liu, Qijun Zhang, Zhiyao Xie

Figure 1 for RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model
Figure 2 for RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model
Figure 3 for RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model
Figure 4 for RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model

Inspired by the recent success of large language models (LLMs) like ChatGPT, researchers start to explore the adoption of LLMs for agile hardware design, such as generating design RTL based on natural-language instructions. However, in existing works, their target designs are all relatively simple and in a small scale, and proposed by the authors themselves, making a fair comparison among different LLM solutions challenging. In addition, many prior works only focus on the design correctness, without evaluating the design qualities of generated design RTL. In this work, we propose an open-source benchmark named RTLLM, for generating design RTL with natural language instructions. To systematically evaluate the auto-generated design RTL, we summarized three progressive goals, named syntax goal, functionality goal, and design quality goal. This benchmark can automatically provide a quantitative evaluation of any given LLM-based solution. Furthermore, we propose an easy-to-use yet surprisingly effective prompt engineering technique named self-planning, which proves to significantly boost the performance of GPT-3.5 in our proposed benchmark.

Viaarxiv icon

Understanding Uncertainty Sampling

Jul 20, 2023
Shang Liu, Xiaocheng Li

Figure 1 for Understanding Uncertainty Sampling
Figure 2 for Understanding Uncertainty Sampling
Figure 3 for Understanding Uncertainty Sampling
Figure 4 for Understanding Uncertainty Sampling

Uncertainty sampling is a prevalent active learning algorithm that queries sequentially the annotations of data samples which the current prediction model is uncertain about. However, the usage of uncertainty sampling has been largely heuristic: (i) There is no consensus on the proper definition of "uncertainty" for a specific task under a specific loss; (ii) There is no theoretical guarantee that prescribes a standard protocol to implement the algorithm, for example, how to handle the sequentially arrived annotated data under the framework of optimization algorithms such as stochastic gradient descent. In this work, we systematically examine uncertainty sampling algorithms under both stream-based and pool-based active learning. We propose a notion of equivalent loss which depends on the used uncertainty measure and the original loss function and establish that an uncertainty sampling algorithm essentially optimizes against such an equivalent loss. The perspective verifies the properness of existing uncertainty measures from two aspects: surrogate property and loss convexity. Furthermore, we propose a new notion for designing uncertainty measures called \textit{loss as uncertainty}. The idea is to use the conditional expected loss given the features as the uncertainty measure. Such an uncertainty measure has nice analytical properties and generality to cover both classification and regression problems, which enable us to provide the first generalization bound for uncertainty sampling algorithms under both stream-based and pool-based settings, in the full generality of the underlying model and problem. Lastly, we establish connections between certain variants of the uncertainty sampling algorithms with risk-sensitive objectives and distributional robustness, which can partly explain the advantage of uncertainty sampling algorithms when the sample size is small.

* Update: add numerical illustrations and experiments; correct some typos and modify the numbering 
Viaarxiv icon

When No-Rejection Learning is Optimal for Regression with Rejection

Jul 06, 2023
Xiaocheng Li, Shang Liu, Chunlin Sun, Hanzhao Wang

Learning with rejection is a prototypical model for studying the interaction between humans and AI on prediction tasks. The model has two components, a predictor and a rejector. Upon the arrival of a sample, the rejector first decides whether to accept it; if accepted, the predictor fulfills the prediction task, and if rejected, the prediction will be deferred to humans. The learning problem requires learning a predictor and a rejector simultaneously. This changes the structure of the conventional loss function and often results in non-convexity and inconsistency issues. For the classification with rejection problem, several works develop surrogate losses for the jointly learning with provable consistency guarantees; in parallel, there has been less work for the regression counterpart. We study the regression with rejection (RwR) problem and investigate the no-rejection learning strategy which treats the RwR problem as a standard regression task to learn the predictor. We establish that the suboptimality of the no-rejection learning strategy observed in the literature can be mitigated by enlarging the function class of the predictor. Then we introduce the truncated loss to single out the learning for the predictor and we show that a consistent surrogate property can be established for the predictor individually in an easier way than for the predictor and the rejector jointly. Our findings advocate for a two-step learning procedure that first uses all the data to learn the predictor and then calibrates the prediction loss for the rejector. It is better aligned with the common intuition that more data samples will lead to a better predictor and it calls for more efforts on a better design of calibration algorithms for learning the rejector. While our discussions mainly focus on the regression problem, the theoretical results and insights generalize to the classification problem as well.

Viaarxiv icon

Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods

May 20, 2023
Shang Liu, Zhongze Cai, Xiaocheng Li

Figure 1 for Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods
Figure 2 for Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods
Figure 3 for Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods
Figure 4 for Distribution-Free Model-Agnostic Regression Calibration via Nonparametric Methods

In this paper, we consider the uncertainty quantification problem for regression models. Specifically, we consider an individual calibration objective for characterizing the quantiles of the prediction model. While such an objective is well-motivated from downstream tasks such as newsvendor cost, the existing methods have been largely heuristic and lack of statistical guarantee in terms of individual calibration. We show via simple examples that the existing methods focusing on population-level calibration guarantees such as average calibration or sharpness can lead to harmful and unexpected results. We propose simple nonparametric calibration methods that are agnostic of the underlying prediction model and enjoy both computational efficiency and statistical consistency. Our approach enables a better understanding of the possibility of individual calibration, and we establish matching upper and lower bounds for the calibration error of our proposed methods. Technically, our analysis combines the nonparametric analysis with a covering number argument for parametric analysis, which advances the existing theoretical analyses in the literature of nonparametric density estimation and quantile bandit problems. Importantly, the nonparametric perspective sheds new theoretical insights into regression calibration in terms of the curse of dimensionality and reconciles the existing results on the impossibility of individual calibration. Numerical experiments show the advantage of such a simple approach under various metrics, and also under covariates shift. We hope our work provides a simple benchmark and a starting point of theoretical ground for future research on regression calibration.

Viaarxiv icon

Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming

Jan 26, 2023
Chunlin Sun, Shang Liu, Xiaocheng Li

Figure 1 for Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming
Figure 2 for Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming
Figure 3 for Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming
Figure 4 for Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming

In this paper, we study the predict-then-optimize problem where the output of a machine learning prediction task is used as the input of some downstream optimization problem, say, the objective coefficient vector of a linear program. The problem is also known as predictive analytics or contextual linear programming. The existing approaches largely suffer from either (i) optimization intractability (a non-convex objective function)/statistical inefficiency (a suboptimal generalization bound) or (ii) requiring strong condition(s) such as no constraint or loss calibration. We develop a new approach to the problem called \textit{maximum optimality margin} which designs the machine learning loss function by the optimality condition of the downstream optimization. The max-margin formulation enjoys both computational efficiency and good theoretical properties for the learning procedure. More importantly, our new approach only needs the observations of the optimal solution in the training data rather than the objective function, which makes it a new and natural approach to the inverse linear programming problem under both contextual and context-free settings; we also analyze the proposed method under both offline and online settings, and demonstrate its performance using numerical experiments.

Viaarxiv icon

Non-stationary Bandits with Knapsacks

May 25, 2022
Shang Liu, Jiashuo Jiang, Xiaocheng Li

Figure 1 for Non-stationary Bandits with Knapsacks

In this paper, we study the problem of bandits with knapsacks (BwK) in a non-stationary environment. The BwK problem generalizes the multi-arm bandit (MAB) problem to model the resource consumption associated with playing each arm. At each time, the decision maker/player chooses to play an arm, and s/he will receive a reward and consume certain amount of resource from each of the multiple resource types. The objective is to maximize the cumulative reward over a finite horizon subject to some knapsack constraints on the resources. Existing works study the BwK problem under either a stochastic or adversarial environment. Our paper considers a non-stationary environment which continuously interpolates between these two extremes. We first show that the traditional notion of variation budget is insufficient to characterize the non-stationarity of the BwK problem for a sublinear regret due to the presence of the constraints, and then we propose a new notion of global non-stationarity measure. We employ both non-stationarity measures to derive upper and lower bounds for the problem. Our results are based on a primal-dual analysis of the underlying linear programs and highlight the interplay between the constraints and the non-stationarity. Finally, we also extend the non-stationarity measure to the problem of online convex optimization with constraints and obtain new regret bounds accordingly.

Viaarxiv icon

Center-of-Mass-based Robust Grasp Pose Adaptation Using RGBD Camera and Force/Torque Sensing

May 02, 2022
Shang Liu, Xiaobao Wei, Lulu Wang, Jing Zhang, Boyu Li, Haosong Yue

Figure 1 for Center-of-Mass-based Robust Grasp Pose Adaptation Using RGBD Camera and Force/Torque Sensing
Figure 2 for Center-of-Mass-based Robust Grasp Pose Adaptation Using RGBD Camera and Force/Torque Sensing
Figure 3 for Center-of-Mass-based Robust Grasp Pose Adaptation Using RGBD Camera and Force/Torque Sensing
Figure 4 for Center-of-Mass-based Robust Grasp Pose Adaptation Using RGBD Camera and Force/Torque Sensing

Object dropping may occur when the robotic arm grasps objects with uneven mass distribution due to additional moments generated by objects' gravity. To solve this problem, we present a novel work that does not require extra wrist and tactile sensors and large amounts of experiments for learning. First, we obtain the center-of-mass position of the rod object using the widely fixed joint torque sensors on the robot arm and RGBD camera. Further, we give the strategy of grasping to improve grasp stability. Simulation experiments are performed in "Mujoco". Results demonstrate that our work is effective in enhancing grasping robustness.

Viaarxiv icon

Points-of-Interest Relationship Inference with Spatial-enriched Graph Neural Networks

Feb 28, 2022
Yile Chen, Xiucheng Li, Gao Cong, Cheng Long, Zhifeng Bao, Shang Liu, Wanli Gu, Fuzheng Zhang

Figure 1 for Points-of-Interest Relationship Inference with Spatial-enriched Graph Neural Networks
Figure 2 for Points-of-Interest Relationship Inference with Spatial-enriched Graph Neural Networks
Figure 3 for Points-of-Interest Relationship Inference with Spatial-enriched Graph Neural Networks
Figure 4 for Points-of-Interest Relationship Inference with Spatial-enriched Graph Neural Networks

As a fundamental component in location-based services, inferring the relationship between points-of-interests (POIs) is very critical for service providers to offer good user experience to business owners and customers. Most of the existing methods for relationship inference are not targeted at POI, thus failing to capture unique spatial characteristics that have huge effects on POI relationships. In this work we propose PRIM to tackle POI relationship inference for multiple relation types. PRIM features four novel components, including a weighted relational graph neural network, category taxonomy integration, a self-attentive spatial context extractor, and a distance-specific scoring function. Extensive experiments on two real-world datasets show that PRIM achieves the best results compared to state-of-the-art baselines and it is robust against data sparsity and is applicable to unseen cases in practice.

Viaarxiv icon