Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Lina Yao

Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning

Feb 27, 2024

Yun Li, Zhe Liu, Hang Chen, Lina Yao

Abstract:Compositional Zero-Shot Learning (CZSL) aims to recognize unseen attribute-object pairs based on a limited set of observed examples. Current CZSL methodologies, despite their advancements, tend to neglect the distinct specificity levels present in attributes. For instance, given images of sliced strawberries, they may fail to prioritize `Sliced-Strawberry' over a generic `Red-Strawberry', despite the former being more informative. They also suffer from ballooning search space when shifting from Close-World (CW) to Open-World (OW) CZSL. To address the issues, we introduce the Context-based and Diversity-driven Specificity learning framework for CZSL (CDS-CZSL). Our framework evaluates the specificity of attributes by considering the diversity of objects they apply to and their related context. This novel approach allows for more accurate predictions by emphasizing specific attribute-object pairs and improves composition filtering in OW-CZSL. We conduct experiments in both CW and OW scenarios, and our model achieves state-of-the-art results across three datasets.

Via

Access Paper or Ask Questions

Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Feb 23, 2024

Yuzhe Zhang, Yipeng Zhang, Yidong Gan, Lina Yao, Chen Wang

Figure 1 for Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Figure 2 for Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Figure 3 for Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Figure 4 for Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

Abstract:Causal graph recovery is essential in the field of causal inference. Traditional methods are typically knowledge-based or statistical estimation-based, which are limited by data collection biases and individuals' knowledge about factors affecting the relations between variables of interests. The advance of large language models (LLMs) provides opportunities to address these problems. We propose a novel method that utilizes the extensive knowledge contained within a large corpus of scientific literature to deduce causal relationships in general causal graph recovery tasks. This method leverages Retrieval Augmented-Generation (RAG) based LLMs to systematically analyze and extract pertinent information from a comprehensive collection of research papers. Our method first retrieves relevant text chunks from the aggregated literature. Then, the LLM is tasked with identifying and labelling potential associations between factors. Finally, we give a method to aggregate the associational relationships to build a causal graph. We demonstrate our method is able to construct high quality causal graphs on the well-known SACHS dataset solely from literature.

Via

Access Paper or Ask Questions

MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment

Feb 21, 2024

Hongtao Huang, Xiaojun Chang, Wen Hu, Lina Yao

Figure 1 for MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment

Figure 2 for MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment

Figure 3 for MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment

Figure 4 for MatchNAS: Optimizing Edge AI in Sparse-Label Data Contexts via Automating Deep Neural Network Porting for Mobile Deployment

Abstract:Recent years have seen the explosion of edge intelligence with powerful Deep Neural Networks (DNNs). One popular scheme is training DNNs on powerful cloud servers and subsequently porting them to mobile devices after being lightweight. Conventional approaches manually specialized DNNs for various edge platforms and retrain them with real-world data. However, as the number of platforms increases, these approaches become labour-intensive and computationally prohibitive. Additionally, real-world data tends to be sparse-label, further increasing the difficulty of lightweight models. In this paper, we propose MatchNAS, a novel scheme for porting DNNs to mobile devices. Specifically, we simultaneously optimise a large network family using both labelled and unlabelled data and then automatically search for tailored networks for different hardware platforms. MatchNAS acts as an intermediary that bridges the gap between cloud-based DNNs and edge-based DNNs.

Via

Access Paper or Ask Questions

Foundation Models for Recommender Systems: A Survey and New Perspectives

Feb 17, 2024

Chengkai Huang, Tong Yu, Kaige Xie, Shuai Zhang, Lina Yao, Julian McAuley

Figure 1 for Foundation Models for Recommender Systems: A Survey and New Perspectives

Figure 2 for Foundation Models for Recommender Systems: A Survey and New Perspectives

Figure 3 for Foundation Models for Recommender Systems: A Survey and New Perspectives

Figure 4 for Foundation Models for Recommender Systems: A Survey and New Perspectives

Abstract:Recently, Foundation Models (FMs), with their extensive knowledge bases and complex architectures, have offered unique opportunities within the realm of recommender systems (RSs). In this paper, we attempt to thoroughly examine FM-based recommendation systems (FM4RecSys). We start by reviewing the research background of FM4RecSys. Then, we provide a systematic taxonomy of existing FM4RecSys research works, which can be divided into four different parts including data characteristics, representation learning, model type, and downstream tasks. Within each part, we review the key recent research developments, outlining the representative models and discussing their characteristics. Moreover, we elaborate on the open problems and opportunities of FM4RecSys aiming to shed light on future research directions in this area. In conclusion, we recap our findings and discuss the emerging trends in this field.

Via

Access Paper or Ask Questions

Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Feb 16, 2024

Xiangyu Zhang, Daijiao Liu, Hexin Liu, Qiquan Zhang, Hanyu Meng, Leibny Paola Garcia, Eng Siong Chng, Lina Yao

Figure 1 for Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Figure 2 for Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Figure 3 for Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Figure 4 for Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

Abstract:Recently, Denoising Diffusion Probabilistic Models (DDPMs) have attained leading performances across a diverse range of generative tasks. However, in the field of speech synthesis, although DDPMs exhibit impressive performance, their long training duration and substantial inference costs hinder practical deployment. Existing approaches primarily focus on enhancing inference speed, while approaches to accelerate training a key factor in the costs associated with adding or customizing voices often necessitate complex modifications to the model, compromising their universal applicability. To address the aforementioned challenges, we propose an inquiry: is it possible to enhance the training/inference speed and performance of DDPMs by modifying the speech signal itself? In this paper, we double the training and inference speed of Speech DDPMs by simply redirecting the generative target to the wavelet domain. This method not only achieves comparable or superior performance to the original model in speech synthesis tasks but also demonstrates its versatility. By investigating and utilizing different wavelet bases, our approach proves effective not just in speech synthesis, but also in speech enhancement.

Via

Access Paper or Ask Questions

Review-Incorporated Model-Agnostic Profile Injection Attacks on Recommender Systems

Feb 14, 2024

Shiyi Yang, Lina Yao, Chen Wang, Xiwei Xu, Liming Zhu

Abstract:Recent studies have shown that recommender systems (RSs) are highly vulnerable to data poisoning attacks. Understanding attack tactics helps improve the robustness of RSs. We intend to develop efficient attack methods that use limited resources to generate high-quality fake user profiles to achieve 1) transferability among black-box RSs 2) and imperceptibility among detectors. In order to achieve these goals, we introduce textual reviews of products to enhance the generation quality of the profiles. Specifically, we propose a novel attack framework named R-Trojan, which formulates the attack objectives as an optimization problem and adopts a tailored transformer-based generative adversarial network (GAN) to solve it so that high-quality attack profiles can be produced. Comprehensive experiments on real-world datasets demonstrate that R-Trojan greatly outperforms state-of-the-art attack methods on various victim RSs under black-box settings and show its good imperceptibility.

* Accepted by ICDM 2023

Via

Access Paper or Ask Questions

Learning with Mixture of Prototypes for Out-of-Distribution Detection

Feb 05, 2024

Haodong Lu, Dong Gong, Shuo Wang, Jason Xue, Lina Yao, Kristen Moore

Figure 1 for Learning with Mixture of Prototypes for Out-of-Distribution Detection

Figure 2 for Learning with Mixture of Prototypes for Out-of-Distribution Detection

Figure 3 for Learning with Mixture of Prototypes for Out-of-Distribution Detection

Figure 4 for Learning with Mixture of Prototypes for Out-of-Distribution Detection

Abstract:Out-of-distribution (OOD) detection aims to detect testing samples far away from the in-distribution (ID) training data, which is crucial for the safe deployment of machine learning models in the real world. Distance-based OOD detection methods have emerged with enhanced deep representation learning. They identify unseen OOD samples by measuring their distances from ID class centroids or prototypes. However, existing approaches learn the representation relying on oversimplified data assumptions, e.g, modeling ID data of each class with one centroid class prototype or using loss functions not designed for OOD detection, which overlook the natural diversities within the data. Naively enforcing data samples of each class to be compact around only one prototype leads to inadequate modeling of realistic data and limited performance. To tackle these issues, we propose PrototypicAl Learning with a Mixture of prototypes (PALM) which models each class with multiple prototypes to capture the sample diversities, and learns more faithful and compact samples embeddings to enhance OOD detection. Our method automatically identifies and dynamically updates prototypes, assigning each sample to a subset of prototypes via reciprocal neighbor soft assignment weights. PALM optimizes a maximum likelihood estimation (MLE) loss to encourage the sample embeddings to be compact around the associated prototypes, as well as a contrastive loss on all prototypes to enhance intra-class compactness and inter-class discrimination at the prototype level. Moreover, the automatic estimation of prototypes enables our approach to be extended to the challenging OOD detection task with unlabelled ID data. Extensive experiments demonstrate the superiority of PALM, achieving state-of-the-art average AUROC performance of 93.82 on the challenging CIFAR-100 benchmark. Code is available at https://github.com/jeff024/PALM.

* Accepted at ICLR 2024

Via

Access Paper or Ask Questions

HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization

Jan 18, 2024

Guanglin Zhou, Zhongyi Han, Shiming Chen, Biwei Huang, Liming Zhu, Tongliang Liu, Lina Yao, Kun Zhang

Figure 1 for HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization

Figure 2 for HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization

Figure 3 for HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization

Figure 4 for HCVP: Leveraging Hierarchical Contrastive Visual Prompt for Domain Generalization

Abstract:Domain Generalization (DG) endeavors to create machine learning models that excel in unseen scenarios by learning invariant features. In DG, the prevalent practice of constraining models to a fixed structure or uniform parameterization to encapsulate invariant features can inadvertently blend specific aspects. Such an approach struggles with nuanced differentiation of inter-domain variations and may exhibit bias towards certain domains, hindering the precise learning of domain-invariant features. Recognizing this, we introduce a novel method designed to supplement the model with domain-level and task-specific characteristics. This approach aims to guide the model in more effectively separating invariant features from specific characteristics, thereby boosting the generalization. Building on the emerging trend of visual prompts in the DG paradigm, our work introduces the novel \textbf{H}ierarchical \textbf{C}ontrastive \textbf{V}isual \textbf{P}rompt (HCVP) methodology. This represents a significant advancement in the field, setting itself apart with a unique generative approach to prompts, alongside an explicit model structure and specialized loss functions. Differing from traditional visual prompts that are often shared across entire datasets, HCVP utilizes a hierarchical prompt generation network enhanced by prompt contrastive learning. These generative prompts are instance-dependent, catering to the unique characteristics inherent to different domains and tasks. Additionally, we devise a prompt modulation network that serves as a bridge, effectively incorporating the generated visual prompts into the vision transformer backbone. Experiments conducted on five DG datasets demonstrate the effectiveness of HCVP, outperforming both established DG algorithms and adaptation protocols.

Via

Access Paper or Ask Questions

SCALA: Sparsification-based Contrastive Learning for Anomaly Detection on Attributed Networks

Jan 08, 2024

Enbo He, Yitong Hao, Yue Zhang, Guisheng Yin, Lina Yao

Abstract:Anomaly detection on attributed networks aims to find the nodes whose behaviors are significantly different from other majority nodes. Generally, network data contains information about relationships between entities, and the anomaly is usually embodied in these relationships. Therefore, how to comprehensively model complex interaction patterns in networks is still a major focus. It can be observed that anomalies in networks violate the homophily assumption. However, most existing studies only considered this phenomenon obliquely rather than explicitly. Besides, the node representation of normal entities can be perturbed easily by the noise relationships introduced by anomalous nodes. To address the above issues, we present a novel contrastive learning framework for anomaly detection on attributed networks, \textbf{SCALA}, aiming to improve the embedding quality of the network and provide a new measurement of qualifying the anomaly score for each node by introducing sparsification into the conventional method. Extensive experiments are conducted on five benchmark real-world datasets and the results show that SCALA consistently outperforms all baseline methods significantly.

* 9 pages, 14 figures

Via

Access Paper or Ask Questions

Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Dec 26, 2023

Yao Liu, Binghao Li, Xianzhi Wang, Claude Sammut, Lina Yao

Figure 1 for Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Figure 2 for Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Figure 3 for Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Figure 4 for Attention-aware Social Graph Transformer Networks for Stochastic Trajectory Prediction

Abstract:Trajectory prediction is fundamental to various intelligent technologies, such as autonomous driving and robotics. The motion prediction of pedestrians and vehicles helps emergency braking, reduces collisions, and improves traffic safety. Current trajectory prediction research faces problems of complex social interactions, high dynamics and multi-modality. Especially, it still has limitations in long-time prediction. We propose Attention-aware Social Graph Transformer Networks for multi-modal trajectory prediction. We combine Graph Convolutional Networks and Transformer Networks by generating stable resolution pseudo-images from Spatio-temporal graphs through a designed stacking and interception method. Furthermore, we design the attention-aware module to handle social interaction information in scenarios involving mixed pedestrian-vehicle traffic. Thus, we maintain the advantages of the Graph and Transformer, i.e., the ability to aggregate information over an arbitrary number of neighbors and the ability to perform complex time-dependent data processing. We conduct experiments on datasets involving pedestrian, vehicle, and mixed trajectories, respectively. Our results demonstrate that our model minimizes displacement errors across various metrics and significantly reduces the likelihood of collisions. It is worth noting that our model effectively reduces the final displacement error, illustrating the ability of our model to predict for a long time.

* 14 pages, 9 figures, 6 tables

Via

Access Paper or Ask Questions