Xiaolong Liu

Huazhong University of Science and Technology

JoyType: A Robust Design for Multilingual Visual Text Creation

Sep 26, 2024

Chao Li, Chen Jiang, Xiaolong Liu, Jun Zhao, Guoxin Wang

Abstract: Generating images with accurately represented text, especially in non-Latin languages, poses a significant challenge for diffusion models. Existing approaches, such as the integration of hint condition diagrams via auxiliary networks (e.g., ControlNet), have made strides towards addressing this issue. However, diffusion models often fall short in tasks requiring controlled text generation, such as specifying particular fonts or producing text in small fonts. In this paper, we introduce a novel approach for multilingual visual text creation, named JoyType, designed to maintain the font style of text during the image generation process. Our methodology begins with assembling a training dataset, JoyType-1M, comprising 1 million pairs of data. Each pair includes an image, its description, and glyph instructions corresponding to the font style within the image. We then developed a text control network, Font ControlNet, tasked with extracting font style information to steer the image generation. To further enhance our model's ability to maintain font style, notably in generating small-font text, we incorporated a multi-layer OCR-aware loss into the diffusion process. This enhancement allows JoyType to direct text rendering using low-level descriptors. Our evaluations, based on both visual and accuracy metrics, demonstrate that JoyType significantly outperforms existing state-of-the-art methods. Additionally, JoyType can function as a plugin, facilitating the creation of varied image styles in conjunction with other Stable Diffusion models on HuggingFace and CivitAI. Our project is open-sourced at https://jdh-algo.github.io/JoyType/.

* Under Review at AAAI 2025

Contrastive Disentangling: Fine-grained representation learning through multi-level contrastive learning without class priors

Sep 07, 2024

Houwang Jiang, Zhuxian Liu, Guodong Liu, Xiaolong Liu, Shihua Zhan

Abstract: Recent advancements in unsupervised representation learning often leverage class information to enhance feature extraction and clustering performance. However, this reliance on class priors limits the applicability of such methods in real-world scenarios where class information is unavailable or ambiguous. In this paper, we propose Contrastive Disentangling (CD), a simple and effective framework that learns representations without any reliance on class priors. Our framework employs a multi-level contrastive learning strategy that combines instance-level and feature-level losses with a normalized entropy loss to learn semantically rich and fine-grained representations. Specifically, (1) the instance-level contrastive loss encourages the separation of feature representations for different samples, (2) the feature-level contrastive loss promotes independence among the feature head predictions, and (3) the normalized entropy loss encourages the feature heads to capture meaningful and prevalent attributes from the data. These components work together to enable CD to significantly outperform existing methods, as demonstrated by extensive experiments on benchmark datasets including CIFAR-10, CIFAR-100, STL-10, and ImageNet-10, particularly in scenarios where class priors are absent. The code is available at https://github.com/Hoper-J/Contrastive-Disentangling.
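
As a rough illustration of what an instance-level contrastive loss does, the sketch below implements a generic InfoNCE loss over two augmented views in plain Python. It is not the CD implementation: the function names, the temperature value, and the toy vectors are all assumptions made for this example, and the feature-level and normalized entropy terms are not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE loss: row i of view_a should match row i of view_b.

    Every other row of view_b acts as a negative for row i (hypothetical
    helper written for illustration, not taken from the CD code base).
    """
    total = 0.0
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(view_a)


# Toy check: matching views should give a lower loss than mismatched views.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

Minimizing this loss pulls the two views of the same sample together while pushing different samples apart, which is the separation effect the abstract attributes to the instance-level term.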

Instruction-based Hypergraph Pretraining

Mar 28, 2024

Mingdai Yang, Zhiwei Liu, Liangwei Yang, Xiaolong Liu, Chen Wang, Hao Peng, Philip S. Yu

Abstract: Pretraining has been widely explored to augment the adaptability of graph learning models to transfer knowledge from large datasets to a downstream task, such as link prediction or classification. However, the gap between training objectives and the discrepancy between data distributions in pretraining and downstream tasks hinder the transfer of the pretrained knowledge. Inspired by instruction-based prompts widely used in pretrained language models, we introduce instructions into graph pretraining. In this paper, we propose a novel pretraining framework named Instruction-based Hypergraph Pretraining (IHP). To overcome the discrepancy between pretraining and downstream tasks, text-based instructions are applied to provide explicit guidance on specific tasks for representation learning. Compared to learnable prompts, whose effectiveness depends on the quality and the diversity of training data, text-based instructions intrinsically encapsulate task information and help the model generalize beyond the structure seen during pretraining. To capture high-order relations with task information in a context-aware manner, a novel prompting hypergraph convolution layer is devised to integrate instructions into information propagation in hypergraphs. Extensive experiments conducted on three public datasets verify the superiority of IHP in various scenarios.

* Accepted by SIGIR'24

Augmentation-Free Dense Contrastive Knowledge Distillation for Efficient Semantic Segmentation

Dec 07, 2023

Jiawei Fan, Chao Li, Xiaolong Liu, Meina Song, Anbang Yao

Abstract: In recent years, knowledge distillation methods based on contrastive learning have achieved promising results on image classification and object detection tasks. However, in this line of research, we note that less attention is paid to semantic segmentation. Existing methods rely heavily on data augmentation and memory buffers, which entail high computational resource demands when applied to semantic segmentation, a task that requires preserving high-resolution feature maps for making dense pixel-wise predictions. To address this problem, we present Augmentation-free Dense Contrastive Knowledge Distillation (Af-DCD), a new contrastive distillation learning paradigm to train compact and accurate deep neural networks for semantic segmentation applications. Af-DCD leverages a masked feature mimicking strategy and formulates a novel contrastive learning loss by taking advantage of tactful feature partitions across both channel and spatial dimensions, which allows dense and structured local knowledge learnt by the teacher model to be transferred effectively to a target student model while maintaining training efficiency. Extensive experiments on five mainstream benchmarks with various teacher-student network pairs demonstrate the effectiveness of our approach. For instance, the DeepLabV3-Res18|DeepLabV3-MBV2 model trained by Af-DCD reaches 77.03%|76.38% mIOU on the Cityscapes dataset when choosing DeepLabV3-Res101 as the teacher, setting new performance records. Besides that, Af-DCD achieves an absolute mIOU improvement of 3.26%|3.04%|2.75%|2.30%|1.42% over individually trained counterparts on Cityscapes|Pascal VOC|Camvid|ADE20K|COCO-Stuff-164K. Code is available at https://github.com/OSVAI/Af-DCD.

* The Af-DCD paper was accepted to NeurIPS 2023. Code and models are available at https://github.com/OSVAI/Af-DCD

Multi-view Graph Convolution for Participant Recommendation

Nov 20, 2023

Xiaolong Liu, Liangwei Yang, Chen Wang, Mingdai Yang, Zhiwei Liu, Philip S. Yu

Abstract: Social networks have become essential to people's lives. The proliferation of web services further expands social networks at an unprecedented scale, leading to immeasurable commercial value for online platforms. Recently, the group buying (GB) business model has become prevalent and increasingly popular in e-commerce. GB explicitly forms groups of users with similar interests to secure better discounts from the merchants, often operating within social networks. It is a novel way to further unlock commercial value by explicitly utilizing the online social network in e-commerce. Participant recommendation, a fundamental problem emerging together with GB, aims to find the participants for a launched group buying process with an initiator and a target item to increase the GB success rate. This paper proposes Multi-View Graph Convolution for Participant Recommendation (MVPRec) to tackle this problem. To differentiate the roles of users (Initiator/Participant) within the GB process, we explicitly reconstruct historical GB data into initiator-view and participant-view graphs. Together with the social graph, we obtain a multi-view user representation with graph encoders. Then MVPRec fuses the GB and social representations with an attention module to obtain the user representation and learns a matching score with the initiator's social friends via a multi-head attention mechanism. The social friends with the top-k matching scores are recommended for the corresponding GB process. Experiments on three datasets demonstrate the effectiveness of MVPRec in the emerging participant recommendation problem.

* 10 pages, 5 figures, 2023 IEEE International Conference on Big Data

Group-Aware Interest Disentangled Dual-Training for Personalized Recommendation

Nov 16, 2023

Xiaolong Liu, Liangwei Yang, Zhiwei Liu, Xiaohan Li, Mingdai Yang, Chen Wang, Philip S. Yu

Abstract: Personalized recommender systems aim to predict users' preferences for items. They have become an indispensable part of online services. Online social platforms enable users to form groups based on their common interests. Users' group participation on social platforms reveals their interests and can be utilized as side information to mitigate the data sparsity and cold-start problems in recommender systems. Users join different groups for different interests. In this paper, we generate group representations from users' interests and propose IGRec (Interest-based Group enhanced Recommendation) to utilize the group information accurately. It consists of four modules: (1) an interest disentangler via self-gating that disentangles users' interests from their initial embedding representation; (2) an interest aggregator that generates the interest-based group representation by Gumbel-Softmax aggregation on the group members' interests; (3) an interest-based group aggregation that fuses the user's representation with the representations of the groups the user participates in; and (4) a dual-trained rating prediction module that utilizes both user-item and group-item interactions. We conduct extensive experiments on three publicly available datasets. Results show that IGRec can effectively alleviate the data sparsity problem and enhance the recommender system with interest-based group representation. Experiments on the group recommendation task further show the informativeness of interest-based group representation.
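
The Gumbel-Softmax step can be pictured with a small sketch. The code below draws one relaxed one-hot sample and uses it to weight group members' interest vectors. It is a generic Gumbel-Softmax written for illustration: how IGRec actually wires it into the interest aggregator is not stated in the abstract, so the per-member weighting, the helper names, and the toy numbers are all assumptions.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Relaxed one-hot sample: softmax((logits + Gumbel noise) / temperature)."""
    noisy = []
    for logit in logits:
        # Keep u strictly inside (0, 1) so both logarithms are defined.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        noisy.append((logit + gumbel_noise) / temperature)
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(member_vectors, weights):
    """Weighted sum of member vectors (hypothetical group representation)."""
    dim = len(member_vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, member_vectors))
            for d in range(dim)]


# Toy usage: three members with made-up relevance logits and 2-d interests.
rng = random.Random(0)
weights = gumbel_softmax([2.0, 0.5, 0.1], temperature=0.5, rng=rng)
group = aggregate([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], weights)
```

Lower temperatures push the sampled weights closer to a one-hot choice, while higher temperatures spread them more evenly across members.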

* 10 pages, 7 figures, 2023 IEEE International Conference on Big Data

Unified Pretraining for Recommendation via Task Hypergraphs

Oct 20, 2023

Mingdai Yang, Zhiwei Liu, Liangwei Yang, Xiaolong Liu, Chen Wang, Hao Peng, Philip S. Yu

Abstract: Although pretraining has garnered significant attention and popularity in recent years, its application in graph-based recommender systems is relatively limited. It is challenging to exploit prior knowledge by pretraining in widely used ID-dependent datasets. On the one hand, user-item interaction history in one dataset can hardly be transferred to other datasets through pretraining, where IDs are different. On the other hand, pretraining and finetuning on the same dataset leads to a high risk of overfitting. In this paper, we propose a novel multitask pretraining framework named Unified Pretraining for Recommendation via Task Hypergraphs (UPRTH). For a unified learning pattern to handle diverse requirements and nuances of various pretext tasks, we design task hypergraphs to generalize pretext tasks to hyperedge prediction. A novel transitional attention layer is devised to discriminatively learn the relevance between each pretext task and recommendation. Experimental results on three benchmark datasets verify the superiority of UPRTH. Additional detailed investigations are conducted to demonstrate the effectiveness of the proposed framework.

* Accepted by WSDM 2024

Knowledge Graph Context-Enhanced Diversified Recommendation

Oct 20, 2023

Xiaolong Liu, Liangwei Yang, Zhiwei Liu, Mingdai Yang, Chen Wang, Hao Peng, Philip S. Yu

Abstract: The field of Recommender Systems (RecSys) has been extensively studied to enhance accuracy by leveraging users' historical interactions. Nonetheless, this persistent pursuit of accuracy frequently engenders diminished diversity, culminating in the well-recognized "echo chamber" phenomenon. Diversified RecSys has emerged as a countermeasure, placing diversity on par with accuracy and garnering noteworthy attention from academic circles and industry practitioners. This research explores the realm of diversified RecSys within the intricate context of knowledge graphs (KG). These KGs act as repositories of interconnected information concerning entities and items, offering a propitious avenue to amplify recommendation diversity through the incorporation of insightful contextual information. Our contributions include introducing two new metrics, Entity Coverage and Relation Coverage, which quantify diversity within the KG domain. Additionally, we introduce the Diversified Embedding Learning (DEL) module, designed to formulate user representations that possess an innate awareness of diversity. In tandem with this, we introduce a novel technique named Conditional Alignment and Uniformity (CAU), which encodes KG item embeddings while preserving contextual integrity. Collectively, our contributions signify a substantial stride towards augmenting the panorama of recommendation diversity within the realm of KG-informed RecSys paradigms.

* 10 pages, 5 figures, accepted by WSDM 2024

Collaborative Contextualization: Bridging the Gap between Collaborative Filtering and Pre-trained Language Model

Oct 13, 2023

Chen Wang, Liangwei Yang, Zhiwei Liu, Xiaolong Liu, Mingdai Yang, Yueqing Liang, Philip S. Yu

Abstract: Traditional recommender systems have heavily relied on identity representations (IDs) to model users and items, while the ascendancy of pre-trained language model (PLM) encoders has enriched the modeling of contextual item descriptions. However, PLMs, although effective in addressing few-shot, zero-shot, or unified modeling scenarios, often neglect the crucial collaborative filtering signal. This neglect gives rise to two pressing challenges: (1) Collaborative Contextualization, the seamless integration of collaborative signals with contextual representations, and (2) the imperative to bridge the representation gap between ID-based representations and contextual representations while preserving their contextual semantics. In this paper, we propose CollabContext, a novel model that adeptly combines collaborative filtering signals with contextual representations and aligns these representations within the contextual space, preserving essential contextual semantics. Experimental results across three real-world datasets demonstrate substantial improvements. Leveraging collaborative contextualization, CollabContext can also be effectively applied to cold-start scenarios, achieving remarkable enhancements in recommendation performance. The code will be made available once the paper is accepted.

Graph-based Alignment and Uniformity for Recommendation

Aug 18, 2023

Liangwei Yang, Zhiwei Liu, Chen Wang, Mingdai Yang, Xiaolong Liu, Jing Ma, Philip S. Yu

Abstract: Collaborative filtering-based recommender systems (RecSys) rely on learning representations for users and items to predict preferences accurately. Representation learning on the hypersphere is a promising approach due to its desirable properties, such as alignment and uniformity. However, the sparsity issue arises when it encounters RecSys. To address this issue, we propose a novel approach, graph-based alignment and uniformity (GraphAU), that explicitly considers high-order connectivities in the user-item bipartite graph. GraphAU aligns the user/item embedding to the dense vector representations of high-order neighbors using a neighborhood aggregator, eliminating the need to compute the burdensome alignment to high-order neighborhoods individually. To address the discrepancy in alignment losses, GraphAU includes a layer-wise alignment pooling module to integrate alignment losses layer-wise. Experiments on four datasets show that GraphAU significantly alleviates the sparsity issue and achieves state-of-the-art performance. We open-source GraphAU at https://github.com/YangLiangwei/GraphAU.
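
For readers unfamiliar with the two hypersphere properties named above, the sketch below computes the commonly used alignment and uniformity losses on L2-normalized embeddings in plain Python. It shows only the basic, non-graph formulation: GraphAU's neighborhood aggregator and layer-wise alignment pooling are not reproduced, and the function names and the value t = 2 are assumptions chosen for this example.

```python
import math
from itertools import combinations


def normalize(v):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def sq_dist(u, v):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment(users, items):
    """Mean squared distance between normalized positive (user, item) pairs."""
    pairs = [(normalize(u), normalize(i)) for u, i in zip(users, items)]
    return sum(sq_dist(u, i) for u, i in pairs) / len(pairs)


def uniformity(embeddings, t=2.0):
    """Log of the mean Gaussian potential over all distinct embedding pairs."""
    points = [normalize(e) for e in embeddings]
    potentials = [math.exp(-t * sq_dist(a, b))
                  for a, b in combinations(points, 2)]
    return math.log(sum(potentials) / len(potentials))


# Toy usage: a perfectly aligned pair, then spread-out vs. clustered points.
perfect = alignment([[2.0, 0.0]], [[1.0, 0.0]])
spread = uniformity([[1.0, 0.0], [-1.0, 0.0]])
clustered = uniformity([[1.0, 0.0], [1.0, 0.01]])
```

Lower is better for both terms: alignment falls as positive pairs move closer together, and uniformity falls as embeddings spread out over the sphere.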

* 4 pages
