Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ziqi Zhang

Peking University

Towards Scalability and Extensibility of Query Reformulation Modeling in E-commerce Search

Feb 17, 2024

Ziqi Zhang, Yupin Huang, Quan Deng, Jinghui Xiao, Vivek Mittal, Jingyuan Deng

Figure 1 for Towards Scalability and Extensibility of Query Reformulation Modeling in E-commerce Search

Figure 2 for Towards Scalability and Extensibility of Query Reformulation Modeling in E-commerce Search

Figure 3 for Towards Scalability and Extensibility of Query Reformulation Modeling in E-commerce Search

Figure 4 for Towards Scalability and Extensibility of Query Reformulation Modeling in E-commerce Search

Abstract:Customer behavioral data significantly impacts e-commerce search systems. However, in the case of less common queries, the associated behavioral data tends to be sparse and noisy, offering inadequate support to the search mechanism. To address this challenge, the concept of query reformulation has been introduced. It suggests that less common queries could utilize the behavior patterns of their popular counterparts with similar meanings. In Amazon product search, query reformulation has displayed its effectiveness in improving search relevance and bolstering overall revenue. Nonetheless, adapting this method for smaller or emerging businesses operating in regions with lower traffic and complex multilingual settings poses the challenge in terms of scalability and extensibility. This study focuses on overcoming this challenge by constructing a query reformulation solution capable of functioning effectively, even when faced with limited training data, in terms of quality and scale, along with relatively complex linguistic characteristics. In this paper we provide an overview of the solution implemented within Amazon product search infrastructure, which encompasses a range of elements, including refining the data mining process, redefining model training objectives, and reshaping training strategies. The effectiveness of the proposed solution is validated through online A/B testing on search ranking and Ads matching. Notably, employing the proposed solution in search ranking resulted in 0.14% and 0.29% increase in overall revenue in Japanese and Hindi cases, respectively, and a 0.08\% incremental gain in the English case compared to the legacy implementation; while in search Ads matching led to a 0.36% increase in Ads revenue in the Japanese case.

Via

Access Paper or Ask Questions

Context-Former: Stitching via Latent Conditioned Sequence Modeling

Feb 03, 2024

Ziqi Zhang, Jingzehua Xu, Jinxin Liu, Zifeng Zhuang, Donglin Wang

Figure 1 for Context-Former: Stitching via Latent Conditioned Sequence Modeling

Figure 2 for Context-Former: Stitching via Latent Conditioned Sequence Modeling

Figure 3 for Context-Former: Stitching via Latent Conditioned Sequence Modeling

Figure 4 for Context-Former: Stitching via Latent Conditioned Sequence Modeling

Abstract:Offline reinforcement learning (RL) algorithms can improve the decision making via stitching sub-optimal trajectories to obtain more optimal ones. This capability is a crucial factor in enabling RL to learn policies that are superior to the behavioral policy. On the other hand, Decision Transformer (DT) abstracts the decision-making as sequence modeling, showcasing competitive performance on offline RL benchmarks, however, recent studies demonstrate that DT lacks of stitching capability, thus exploit stitching capability for DT is vital to further improve its performance. In order to endow stitching capability to DT, we abstract trajectory stitching as expert matching and introduce our approach, ContextFormer, which integrates contextual information-based imitation learning (IL) and sequence modeling to stitch sub-optimal trajectory fragments by emulating the representations of a limited number of expert trajectories. To validate our claim, we conduct experiments from two perspectives: 1) We conduct extensive experiments on D4RL benchmarks under the settings of IL, and experimental results demonstrate ContextFormer can achieve competitive performance in multi-IL settings. 2) More importantly, we conduct a comparison of ContextFormer with diverse competitive DT variants using identical training datasets. The experimental results unveiled ContextFormer's superiority, as it outperformed all other variants, showcasing its remarkable performance.

Via

Access Paper or Ask Questions

Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Dec 25, 2023

Yifan Lu, Ziqi Zhang, Chunfeng Yuan, Peng Li, Yan Wang, Bing Li, Weiming Hu

Figure 1 for Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Figure 2 for Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Figure 3 for Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Figure 4 for Set Prediction Guided by Semantic Concepts for Diverse Video Captioning

Abstract:Diverse video captioning aims to generate a set of sentences to describe the given video in various aspects. Mainstream methods are trained with independent pairs of a video and a caption from its ground-truth set without exploiting the intra-set relationship, resulting in low diversity of generated captions. Different from them, we formulate diverse captioning into a semantic-concept-guided set prediction (SCG-SP) problem by fitting the predicted caption set to the ground-truth set, where the set-level relationship is fully captured. Specifically, our set prediction consists of two synergistic tasks, i.e., caption generation and an auxiliary task of concept combination prediction providing extra semantic supervision. Each caption in the set is attached to a concept combination indicating the primary semantic content of the caption and facilitating element alignment in set prediction. Furthermore, we apply a diversity regularization term on concepts to encourage the model to generate semantically diverse captions with various concept combinations. These two tasks share multiple semantics-specific encodings as input, which are obtained by iterative interaction between visual features and conceptual queries. The correspondence between the generated captions and specific concept combinations further guarantees the interpretability of our model. Extensive experiments on benchmark datasets show that the proposed SCG-SP achieves state-of-the-art (SOTA) performance under both relevance and diversity metrics.

* aaai 2024 accepted

Via

Access Paper or Ask Questions

Adaptive Proximal Policy Optimization with Upper Confidence Bound

Dec 12, 2023

Ziqi Zhang, Jingzehua Xu, Zifeng Zhuang, Jinxin Liu, Donglin wang

Figure 1 for Adaptive Proximal Policy Optimization with Upper Confidence Bound

Figure 2 for Adaptive Proximal Policy Optimization with Upper Confidence Bound

Figure 3 for Adaptive Proximal Policy Optimization with Upper Confidence Bound

Figure 4 for Adaptive Proximal Policy Optimization with Upper Confidence Bound

Abstract:Trust Region Policy Optimization (TRPO) attractively optimizes the policy while constraining the update of the new policy within a trust region, ensuring the stability and monotonic optimization. Building on the theoretical guarantees of trust region optimization, Proximal Policy Optimization (PPO) successfully enhances the algorithm's sample efficiency and reduces deployment complexity by confining the update of the new and old policies within a surrogate trust region. However, this approach is limited by the fixed setting of surrogate trust region and is not sufficiently adaptive, because there is no theoretical proof that the optimal clipping bound remains consistent throughout the entire training process, truncating the ratio of the new and old policies within surrogate trust region can ensure that the algorithm achieves its best performance, therefore, exploring and researching a dynamic clip bound for improving PPO's performance can be quite beneficial. To design an adaptive clipped trust region and explore the dynamic clip bound's impact on the performance of PPO, we introduce an adaptive PPO-CLIP (Adaptive-PPO) method that dynamically explores and exploits the clip bound using a bandit during the online training process. Furthermore, ample experiments will initially demonstrate that our Adaptive-PPO exhibits sample efficiency and performance compared to PPO-CLIP.

Via

Access Paper or Ask Questions

No Privacy Left Outside: On the Security of TEE-Shielded DNN Partition for On-Device ML

Oct 11, 2023

Ziqi Zhang, Chen Gong, Yifeng Cai, Yuanyuan Yuan, Bingyan Liu, Ding Li, Yao Guo, Xiangqun Chen

Figure 1 for No Privacy Left Outside: On the Security of TEE-Shielded DNN Partition for On-Device ML

Figure 2 for No Privacy Left Outside: On the Security of TEE-Shielded DNN Partition for On-Device ML

Figure 3 for No Privacy Left Outside: On the Security of TEE-Shielded DNN Partition for On-Device ML

Figure 4 for No Privacy Left Outside: On the Security of TEE-Shielded DNN Partition for On-Device ML

Abstract:On-device ML introduces new security challenges: DNN models become white-box accessible to device users. Based on white-box information, adversaries can conduct effective model stealing (MS) and membership inference attack (MIA). Using Trusted Execution Environments (TEEs) to shield on-device DNN models aims to downgrade (easy) white-box attacks to (harder) black-box attacks. However, one major shortcoming is the sharply increased latency (up to 50X). To accelerate TEE-shield DNN computation with GPUs, researchers proposed several model partition techniques. These solutions, referred to as TEE-Shielded DNN Partition (TSDP), partition a DNN model into two parts, offloading the privacy-insensitive part to the GPU while shielding the privacy-sensitive part within the TEE. This paper benchmarks existing TSDP solutions using both MS and MIA across a variety of DNN models, datasets, and metrics. We show important findings that existing TSDP solutions are vulnerable to privacy-stealing attacks and are not as safe as commonly believed. We also unveil the inherent difficulty in deciding optimal DNN partition configurations (i.e., the highest security with minimal utility cost) for present TSDP solutions. The experiments show that such ``sweet spot'' configurations vary across datasets and models. Based on lessons harvested from the experiments, we present TEESlice, a novel TSDP method that defends against MS and MIA during DNN inference. TEESlice follows a partition-before-training strategy, which allows for accurate separation between privacy-related weights from public weights. TEESlice delivers the same security protection as shielding the entire DNN model inside TEE (the ``upper-bound'' security guarantees) with over 10X less overhead (in both experimental and real-world environments) than prior TSDP solutions and no accuracy loss.

* Accepted by S&P'24

Via

Access Paper or Ask Questions

Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning

Jun 22, 2023

Jinxin Liu, Ziqi Zhang, Zhenyu Wei, Zifeng Zhuang, Yachen Kang, Sibo Gai, Donglin Wang

Figure 1 for Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning

Figure 2 for Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning

Figure 3 for Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning

Figure 4 for Beyond OOD State Actions: Supported Cross-Domain Offline Reinforcement Learning

Abstract:Offline reinforcement learning (RL) aims to learn a policy using only pre-collected and fixed data. Although avoiding the time-consuming online interactions in RL, it poses challenges for out-of-distribution (OOD) state actions and often suffers from data inefficiency for training. Despite many efforts being devoted to addressing OOD state actions, the latter (data inefficiency) receives little attention in offline RL. To address this, this paper proposes the cross-domain offline RL, which assumes offline data incorporate additional source-domain data from varying transition dynamics (environments), and expects it to contribute to the offline data efficiency. To do so, we identify a new challenge of OOD transition dynamics, beyond the common OOD state actions issue, when utilizing cross-domain offline data. Then, we propose our method BOSA, which employs two support-constrained objectives to address the above OOD issues. Through extensive experiments in the cross-domain offline RL setting, we demonstrate BOSA can greatly improve offline data efficiency: using only 10\% of the target data, BOSA could achieve {74.4\%} of the SOTA offline RL performance that uses 100\% of the target data. Additionally, we also show BOSA can be effortlessly plugged into model-based offline RL and noising data augmentation techniques (used for generating source-domain data), which naturally avoids the potential dynamics mismatch between target-domain data and newly generated source-domain data.

Via

Access Paper or Ask Questions

The Barriers to Online Clothing Websites for Visually Impaired People: An Interview and Observation Approach to Understanding Needs

May 19, 2023

Amnah Alluqmani, Morgan Harvey, Ziqi Zhang

Figure 1 for The Barriers to Online Clothing Websites for Visually Impaired People: An Interview and Observation Approach to Understanding Needs

Figure 2 for The Barriers to Online Clothing Websites for Visually Impaired People: An Interview and Observation Approach to Understanding Needs

Figure 3 for The Barriers to Online Clothing Websites for Visually Impaired People: An Interview and Observation Approach to Understanding Needs

Figure 4 for The Barriers to Online Clothing Websites for Visually Impaired People: An Interview and Observation Approach to Understanding Needs

Abstract:Visually impaired (VI) people often face challenges when performing everyday tasks and identify shopping for clothes as one of the most challenging. Many engage in online shopping, which eliminates some challenges of physical shopping. However, clothes shopping online suffers from many other limitations and barriers. More research is needed to address these challenges, and extant works often base their findings on interviews alone, providing only subjective, recall-biased information. We conducted two complementary studies using both observational and interview approaches to fill a gap in understanding about VI people's behaviour when selecting and purchasing clothes online. Our findings show that shopping websites suffer from inaccurate, misleading, and contradictory clothing descriptions; that VI people mainly rely on (unreliable) search tools and check product descriptions by reviewing customer comments. Our findings also indicate that VI people are hesitant to accept assistance from automated, but that trust in such systems could be improved if researchers can develop systems that better accommodate users' needs and preferences.

Via

Access Paper or Ask Questions

Mining Healthcare Procurement Data Using Text Mining and Natural Language Processing -- Reflection From An Industrial Project

Jan 09, 2023

Ziqi Zhang, Tomas Jasaitis, Richard Freeman, Rowida Alfrjani, Adam Funk

Abstract:While text mining and NLP research has been established for decades, there remain gaps in the literature that reports the use of these techniques in building real-world applications. For example, they typically look at single and sometimes simplified tasks, and do not discuss in-depth data heterogeneity and inconsistency that is common in real-world problems or their implication on the development of their methods. Also, few prior work has focused on the healthcare domain. In this work, we describe an industry project that developed text mining and NLP solutions to mine millions of heterogeneous, multilingual procurement documents in the healthcare sector. We extract structured procurement contract data that is used to power a platform for dynamically assessing supplier risks. Our work makes unique contributions in a number of ways. First, we deal with highly heterogeneous, multilingual data and we document our approach to tackle these challenges. This is mainly based on a method that effectively uses domain knowledge and generalises to multiple text mining and NLP tasks and languages. Second, applying this method to mine millions of procurement documents, we develop the first structured procurement contract database that will help facilitate the tendering process. Second, Finally, we discuss lessons learned for practical text mining/NLP development, and make recommendations for future research and practice.

Via

Access Paper or Ask Questions

Decoupled Mixup for Generalized Visual Recognition

Oct 26, 2022

Haozhe Liu, Wentian Zhang, Jinheng Xie, Haoqian Wu, Bing Li, Ziqi Zhang, Yuexiang Li, Yawen Huang, Bernard Ghanem, Yefeng Zheng

Abstract:Convolutional neural networks (CNN) have demonstrated remarkable performance when the training and testing data are from the same distribution. However, such trained CNN models often largely degrade on testing data which is unseen and Out-Of-the-Distribution (OOD). To address this issue, we propose a novel "Decoupled-Mixup" method to train CNN models for OOD visual recognition. Different from previous work combining pairs of images homogeneously, our method decouples each image into discriminative and noise-prone regions, and then heterogeneously combines these regions of image pairs to train CNN models. Since the observation is that noise-prone regions such as textural and clutter backgrounds are adverse to the generalization ability of CNN models during training, we enhance features from discriminative regions and suppress noise-prone ones when combining an image pair. To further improve the generalization ability of trained models, we propose to disentangle discriminative and noise-prone regions in frequency-based and context-based fashions. Experiment results show the high generalization performance of our method on testing data that are composed of unseen contexts, where our method achieves 85.76\% top-1 accuracy in Track-1 and 79.92\% in Track-2 in the NICO Challenge. The source code is available at https://github.com/HaozheLiu-ST/NICOChallenge-OOD-Classification.

* Accepted by ECCV'2022 Workshop: Causality in Vision

Via

Access Paper or Ask Questions

KSG: Knowledge and Skill Graph

Sep 13, 2022

Feng Zhao, Ziqi Zhang, Donglin Wang

Figure 1 for KSG: Knowledge and Skill Graph

Figure 2 for KSG: Knowledge and Skill Graph

Figure 3 for KSG: Knowledge and Skill Graph

Figure 4 for KSG: Knowledge and Skill Graph

Abstract:The knowledge graph (KG) is an essential form of knowledge representation that has grown in prominence in recent years. Because it concentrates on nominal entities and their relationships, traditional knowledge graphs are static and encyclopedic in nature. On this basis, event knowledge graph (Event KG) models the temporal and spatial dynamics by text processing to facilitate downstream applications, such as question-answering, recommendation and intelligent search. Existing KG research, on the other hand, mostly focuses on text processing and static facts, ignoring the vast quantity of dynamic behavioral information included in photos, movies, and pre-trained neural networks. In addition, no effort has been done to include behavioral intelligence information into the knowledge graph for deep reinforcement learning (DRL) and robot learning. In this paper, we propose a novel dynamic knowledge and skill graph (KSG), and then we develop a basic and specific KSG based on CN-DBpedia. The nodes are divided into entity and attribute nodes, with entity nodes containing the agent, environment, and skill (DRL policy or policy representation), and attribute nodes containing the entity description, pre-train network, and offline dataset. KSG can search for different agents' skills in various environments and provide transferable information for acquiring new skills. This is the first study that we are aware of that looks into dynamic KSG for skill retrieval and learning. Extensive experimental results on new skill learning show that KSG boosts new skill learning efficiency.

* 5 pages, 7 figures, published to CIKM 2022

Via

Access Paper or Ask Questions