Dong Wang

A Collaborative Transfer Learning Framework for Cross-domain Recommendation

Jun 26, 2023

Wei Zhang, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang

Abstract: In recommendation systems, multiple business domains serve the diverse interests and needs of users, and the click-through rate (CTR) of each domain can differ considerably, which creates demand for CTR prediction models tailored to different business domains. The industry solution is to use either domain-specific models or transfer learning techniques for each domain. The disadvantage of the former is that a single-domain model does not utilize data from other domains. The latter leverages data from all domains, but fine-tuning may trap the model in a local optimum of the source domain, making it difficult to fit the target domain. Meanwhile, significant differences in data quantity and feature schemas between domains, known as domain shift, may lead to negative transfer. To overcome these challenges, we propose the Collaborative Cross-Domain Transfer Learning Framework (CCTL). CCTL evaluates the information gain of the source domain on the target domain using a symmetric companion network and adjusts the information transfer weight of each source domain sample using the information flow network. This approach enables full utilization of other domain data while avoiding negative transfer. Additionally, a representation enhancement network is used as an auxiliary task to preserve domain-specific features. In comprehensive experiments on both public and real-world industrial datasets, CCTL achieved state-of-the-art (SOTA) scores on offline metrics. CCTL has also been deployed at Meituan, bringing a 4.37% CTR lift and a 5.43% GMV lift, which is significant to the business.
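
The idea of weighting each source-domain sample can be pictured with a small sketch. This is not the CCTL implementation: the abstract does not say how the information flow network produces its weights or how the losses are combined, so the weights below are plain inputs, and the function names and the normalization by the weight sum are assumptions made for illustration.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted click probability."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))

def mixed_domain_loss(target_batch, source_batch, source_weights):
    """Target-domain loss plus a per-sample weighted source-domain loss.

    Each batch is a list of (predicted_probability, label) pairs.
    source_weights are assumed to lie in [0, 1]; a weight of 0 removes
    that source sample from the objective (hypothetical convention).
    """
    target_loss = sum(bce(p, y) for p, y in target_batch) / len(target_batch)
    weighted = sum(w * bce(p, y)
                   for (p, y), w in zip(source_batch, source_weights))
    source_loss = weighted / max(sum(source_weights), 1e-12)
    return target_loss + source_loss

target_batch = [(0.9, 1), (0.2, 0)]
source_batch = [(0.8, 1), (0.1, 1)]  # second sample fits poorly
target_only = mixed_domain_loss(target_batch, source_batch, [0.0, 0.0])
helpful_only = mixed_domain_loss(target_batch, source_batch, [1.0, 0.0])
all_source = mixed_domain_loss(target_batch, source_batch, [1.0, 1.0])
```

Setting a source sample's weight to zero leaves the target-domain loss untouched, which is the sense in which down-weighting can guard against negative transfer.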

* KDD 2023 accepted

HSR-Diff:Hyperspectral Image Super-Resolution via Conditional Diffusion Models

Jun 21, 2023

Chanyue Wu, Dong Wang, Hanyu Mao, Ying Li

Abstract: Despite the proven significance of hyperspectral images (HSIs) in various computer vision tasks, their potential is limited by low resolution (LR) in the spatial domain, which results from multiple physical factors. Inspired by recent advancements in deep generative models, we propose an HSI Super-resolution (SR) approach with Conditional Diffusion Models (HSR-Diff) that merges a high-resolution (HR) multispectral image (MSI) with the corresponding LR-HSI. HSR-Diff generates an HR-HSI via repeated refinement, in which the HR-HSI is initialized with pure Gaussian noise and iteratively refined. At each iteration, the noise is removed with a Conditional Denoising Transformer (CDFormer) that is trained on denoising at different noise levels, conditioned on the hierarchical feature maps of the HR-MSI and LR-HSI. In addition, a progressive learning strategy is employed to exploit the global information of full-resolution images. Systematic experiments have been conducted on four public datasets, demonstrating that HSR-Diff outperforms state-of-the-art methods.
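
The "start from Gaussian noise and refine" loop can be sketched with the standard DDPM reverse update. This is a toy under stated assumptions, not HSR-Diff: the abstract does not specify the sampler, the noise schedule, or what the CDFormer predicts, so the linear schedule, the noise-prediction form, and the oracle denoiser (which cheats by treating the condition as the clean signal) are all placeholders.

```python
import math
import random

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice); requires num_steps >= 2."""
    betas = [beta_start + (beta_end - beta_start) * t / (num_steps - 1)
             for t in range(num_steps)]
    alphas = [1.0 - b for b in betas]
    alpha_bars, running = [], 1.0
    for a in alphas:
        running *= a
        alpha_bars.append(running)
    return betas, alphas, alpha_bars

def make_oracle_denoiser(alpha_bars):
    """Stand-in for a trained network: recovers the exact noise by treating
    the condition as the clean signal. A real model would have to learn this."""
    def denoiser(x_t, t, condition):
        scale = math.sqrt(alpha_bars[t])
        noise_std = math.sqrt(1.0 - alpha_bars[t])
        return [(x - scale * c) / noise_std for x, c in zip(x_t, condition)]
    return denoiser

def refine_from_noise(denoiser, condition, num_steps=50, seed=0):
    """DDPM-style reverse process: pure noise in, refined sample out."""
    rng = random.Random(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = [rng.gauss(0.0, 1.0) for _ in condition]  # pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = denoiser(x, t, condition)           # predicted noise at step t
        coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
        mean = [(xi - coef * ei) / math.sqrt(alphas[t])
                for xi, ei in zip(x, eps)]
        if t > 0:
            sigma = math.sqrt(betas[t])           # re-inject noise except at the end
            x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        else:
            x = mean
    return x

_, _, alpha_bars = make_schedule(50)
clean = [0.5, -1.0, 2.0]
restored = refine_from_noise(make_oracle_denoiser(alpha_bars), clean, num_steps=50)
```

With the oracle in place the loop returns the clean signal up to floating-point error, which only checks the update rule; the quality of a real sampler depends entirely on the trained denoiser.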

Boosting Breast Ultrasound Video Classification by the Guidance of Keyframe Feature Centers

Jun 12, 2023

AnLan Sun, Zhao Zhang, Meng Lei, Yuting Dai, Dong Wang, Liwei Wang

Abstract: Breast ultrasound videos contain richer information than ultrasound images; therefore, it is more meaningful to develop video models for this diagnosis task. However, collecting ultrasound video datasets is much harder. In this paper, we explore the feasibility of enhancing the performance of ultrasound video classification using a static image dataset. To this end, we propose KGA-Net and a coherence loss. KGA-Net uses both video clips and static images to train the network. The coherence loss uses the feature centers generated from the static images to guide the frame attention in the video model. Our KGA-Net boosts the performance on the public BUSV dataset by a large margin. The visualization results of frame attention demonstrate the explainability of our method. The code and model weights of our method will be made publicly available.

* Medical Image Computing and Computer-Assisted Intervention 2023

Graph Based Long-Term And Short-Term Interest Model for Click-Through Rate Prediction

Jun 05, 2023

Huinan Sun, Guangliang Yu, Pengye Zhang, Bo Zhang, Xingxing Wang, Dong Wang

Abstract: Click-through rate (CTR) prediction aims to predict the probability that a user will click an item, and it has been one of the key tasks in online recommender and advertising systems. In such systems, rich user behavior (viz. long- and short-term) has been proved to be of great value in capturing user interests. Both industry and academia have paid much attention to this topic and have proposed different approaches to modeling long-term and short-term user behavior data. But there are still some unresolved issues. More specifically, (1) rule- and truncation-based methods for extracting information from long-term behavior easily cause information loss, and (2) extracting information from short-term behavior using a single type of feedback, regardless of scenario, leads to information confusion and noise. To fill this gap, we propose a Graph based Long-term and Short-term interest Model, termed GLSM. It consists of a multi-interest graph structure for capturing long-term user behavior, a multi-scenario heterogeneous sequence model for modeling short-term information, and an adaptive fusion mechanism to fuse information from long-term and short-term behaviors. In comprehensive experiments on real-world datasets, GLSM achieved state-of-the-art (SOTA) scores on offline metrics. GLSM has also been deployed in our industrial application, bringing a 4.9% CTR lift and a 4.3% GMV lift, which is significant to the business.

* CIKM 2022 accepted

Safe Offline Reinforcement Learning with Real-Time Budget Constraints

Jun 01, 2023

Qian Lin, Bo Tang, Zifan Wu, Chao Yu, Shangqin Mao, Qianlong Xie, Xingxing Wang, Dong Wang

Abstract: Aiming at promoting the safe real-world deployment of Reinforcement Learning (RL), research on safe RL has made significant progress in recent years. However, most existing works in the literature still focus on the online setting, where risky violations of the safety budget are likely to be incurred during training. Besides, in many real-world applications, the learned policy is required to respond to dynamically determined safety budgets (i.e., constraint thresholds) in real time. In this paper, we target the above real-time budget constraint problem under the offline setting, and propose Trajectory-based REal-time Budget Inference (TREBI) as a novel solution that approaches this problem from the perspective of trajectory distribution. Theoretically, we prove an error bound on the estimation of the episodic reward and cost under the offline setting and thus provide a performance guarantee for TREBI. Empirical results on a wide range of simulation tasks and a real-world large-scale advertising application demonstrate the capability of TREBI in solving real-time budget constraint problems under offline settings.

* We propose a method to handle the constraint problem with dynamically determined safety budgets under the offline setting

Mining Negative Temporal Contexts For False Positive Suppression In Real-Time Ultrasound Lesion Detection

May 29, 2023

Haojun Yu, Youcheng Li, QuanLin Wu, Ziwei Zhao, Dengbo Chen, Dong Wang, Liwei Wang

Abstract: During ultrasonic scanning processes, real-time lesion detection can assist radiologists in accurate cancer diagnosis. However, this essential task remains challenging and underexplored. General-purpose real-time object detection models can mistakenly report obvious false positives (FPs) when applied to ultrasound videos, potentially misleading junior radiologists. One key issue is their failure to utilize negative symptoms in previous frames, denoted as negative temporal contexts (NTC). To address this issue, we propose to extract contexts from previous frames, including NTC, with the guidance of inverse optical flow. By aggregating extracted contexts, we endow the model with the ability to suppress FPs by leveraging NTC. We call the resulting model UltraDet. The proposed UltraDet demonstrates significant improvement over previous state-of-the-art methods and achieves real-time inference speed. To facilitate future research, we will release the code, checkpoints, and high-quality labels of the CVA-BUS dataset used in our experiments.

* 10 pages, 4 figures, MICCAI 2023 Early Accept

Diffusion Model is an Effective Planner and Data Synthesizer for Multi-Task Reinforcement Learning

May 29, 2023

Haoran He, Chenjia Bai, Kang Xu, Zhuoran Yang, Weinan Zhang, Dong Wang, Bin Zhao, Xuelong Li

Abstract: Diffusion models have demonstrated highly expressive generative capabilities in vision and NLP. Recent studies in reinforcement learning (RL) have shown that diffusion models are also powerful in modeling complex policies or trajectories in offline datasets. However, these works have been limited to single-task settings, where a generalist agent capable of addressing multi-task predicaments is absent. In this paper, we aim to investigate the effectiveness of a single diffusion model in modeling large-scale multi-task offline data, which can be challenging due to diverse and multimodal data distributions. Specifically, we propose Multi-Task Diffusion Model (MTDiff), a diffusion-based method that incorporates Transformer backbones and prompt learning for generative planning and data synthesis in multi-task offline settings. MTDiff leverages vast amounts of knowledge available in multi-task data and performs implicit knowledge sharing among tasks. For generative planning, we find MTDiff outperforms state-of-the-art algorithms across 50 tasks on Meta-World and 8 maps on Maze2D. For data synthesis, MTDiff generates high-quality data for testing tasks given a single demonstration as a prompt, which enhances the low-quality datasets for even unseen tasks.

* 21 pages

Cross-Domain Policy Adaptation via Value-Guided Data Filtering

May 28, 2023

Kang Xu, Chenjia Bai, Xiaoteng Ma, Dong Wang, Bin Zhao, Zhen Wang, Xuelong Li, Wei Li

Abstract: Generalizing policies across different domains with dynamics mismatch poses a significant challenge in reinforcement learning. For example, a robot learns a policy in a simulator, but when it is deployed in the real world, the dynamics of the environment may be different. Given source and target domains with dynamics mismatch, we consider the online dynamics adaptation problem, in which the agent can access sufficient source domain data while online interactions with the target domain are limited. Existing research has attempted to solve the problem from the dynamics discrepancy perspective. In this work, we reveal the limitations of these methods and explore the problem from the value difference perspective via a novel insight on the value consistency across domains. Specifically, we present the Value-Guided Data Filtering (VGDF) algorithm, which selectively shares transitions from the source domain based on the proximity of paired value targets across the two domains. Empirical results on various environments with kinematic and morphology shifts demonstrate that our method achieves superior performance compared to prior approaches.
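
Filtering source transitions by how close their paired value targets are can be sketched as a simple ranking step. The abstract does not say how the target-domain value targets are estimated or what selection rule VGDF applies, so the precomputed targets, the `keep_fraction` parameter, and the function name below are hypothetical.

```python
def filter_by_value_gap(transitions, source_targets, target_targets,
                        keep_fraction=0.25):
    """Keep the source transitions whose value targets agree best across domains.

    source_targets[i] and target_targets[i] are the value targets computed for
    transitions[i] under the source and (estimated) target dynamics. How the
    target-domain estimate is obtained is outside this sketch.
    """
    gaps = [abs(s - t) for s, t in zip(source_targets, target_targets)]
    n_keep = max(1, int(len(transitions) * keep_fraction))
    # Rank by gap (smallest first), then restore the original ordering.
    ranked = sorted(range(len(transitions)), key=lambda i: gaps[i])
    kept = sorted(ranked[:n_keep])
    return [transitions[i] for i in kept]

transitions = ["t0", "t1", "t2", "t3"]
source_targets = [1.0, 2.0, 3.0, 4.0]
target_targets = [1.1, 5.0, 2.9, 0.0]
shared = filter_by_value_gap(transitions, source_targets, target_targets,
                             keep_fraction=0.5)
```

In this toy run the two transitions with a gap of about 0.1 are shared, while those with gaps of 3.0 and 4.0 are held back.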

* 27 pages, 15 figures

Spot keywords from very noisy and mixed speech

May 28, 2023

Ying Shi, Dong Wang, Lantian Li, Jiqing Han, Shi Yin

Abstract: Most existing keyword spotting research focuses on conditions with slight or moderate noise. In this paper, we try to tackle a more challenging task: detecting keywords buried under strong interfering speech (10 times higher than the keyword in amplitude), and even worse, mixed with other keywords. We propose a novel Mix Training (MT) strategy that encourages the model to discover low-energy keywords from noisy and mixed speech. Experiments were conducted with a vanilla CNN and two EfficientNet (B0/B2) architectures. The results evaluated with the Google Speech Command dataset demonstrated that the proposed mix training approach is highly effective and outperforms standard data augmentation and mixup training.
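
To put the "10 times higher in amplitude" condition in more familiar terms: an amplitude ratio of 10 corresponds to a signal-to-interference ratio of 20·log10(1/10) = -20 dB. The sketch below shows that conversion and one way to build such a mixture. It illustrates the test condition only, not the Mix Training strategy, and measuring amplitude by RMS is an assumption the abstract does not confirm.

```python
import math

def rms(signal):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def mix_at_amplitude_ratio(keyword, interference, ratio):
    """Scale interference to `ratio` times the keyword RMS, then add it."""
    scale = ratio * rms(keyword) / rms(interference)
    return [k + scale * i for k, i in zip(keyword, interference)]

def amplitude_ratio_to_db(ratio):
    """Keyword-to-interference ratio in dB when interference is `ratio` times louder."""
    return 20.0 * math.log10(1.0 / ratio)

# Synthetic tones stand in for real speech recordings.
keyword = [math.sin(0.3 * n) for n in range(200)]
interference = [math.cos(0.7 * n) for n in range(200)]
mixed = mix_at_amplitude_ratio(keyword, interference, ratio=10.0)
level_db = amplitude_ratio_to_db(10.0)
```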

* Interspeech 2023

Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning

May 27, 2023

Zhenrui Yue, Huimin Zeng, Mengfei Lan, Heng Ji, Dong Wang

Abstract: With emerging online topics as a source for numerous new events, detecting unseen or rare event types presents an elusive challenge for existing event detection methods, where only limited data access is provided for training. To address the data scarcity problem in event detection, we propose MetaEvent, a meta learning-based framework for zero- and few-shot event detection. Specifically, we sample training tasks from existing event types and perform meta training to search for optimal parameters that quickly adapt to unseen tasks. In our framework, we propose to use a cloze-based prompt and a trigger-aware soft verbalizer to efficiently project output to unseen event types. Moreover, we design a contrastive meta objective based on maximum mean discrepancy (MMD) to learn class-separating features. As such, the proposed MetaEvent can perform zero-shot event detection by mapping features to event types without any prior knowledge. In our experiments, we demonstrate the effectiveness of MetaEvent in both zero-shot and few-shot scenarios, where the proposed method achieves state-of-the-art performance on the benchmark datasets FewEvent and MAVEN.
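
For readers unfamiliar with MMD, the sketch below computes the standard biased estimate of squared MMD between two sets of feature vectors using a Gaussian kernel. It shows the quantity itself, not MetaEvent's contrastive objective: the kernel choice, the `gamma` bandwidth, and the toy features are assumptions.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    def mean_kernel(us, vs):
        total = sum(rbf_kernel(u, v, gamma) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

# Toy features for two well-separated classes.
class_a = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.1]]
class_b = [[2.0, 2.1], [2.2, 1.9], [1.9, 2.0]]
same_class = mmd_squared(class_a, class_a)
different_class = mmd_squared(class_a, class_b)
```

A sample compared with itself gives zero, while well-separated classes give a large value, so an objective that pushes MMD up between classes encourages class-separating features.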

* Accepted to ACL 2023
