Qiang Lyu

Divide and Conquer: Decoupled Representation Alignment for Multimodal World Models

May 03, 2026

Junyuan Xiao, Dingkang Liang, Xin Zhou, Yixuan Ye, Tongtong Su, Guangmo Yi, Bin Xia, Qiang Lyu, Shurui Shi, Jun Huang(+2 more)

Abstract:Emerging multi-modal world models attempt to jointly generate videos across diverse modalities (e.g., RGB, depth, and mask), yet they fail to fully exploit the rich priors of existing foundation models. We propose $M^2$-REPA, the first representation alignment method tailored for multi-modal video generation. Our key insight is that foundation models trained on different modality spaces naturally capture distinct domain-specific priors, acting as complementary "experts." Specifically, we first decouple modality-specific features from the diffusion model's intermediate representations, then align each with its corresponding expert foundation model. To this end, we design two synergistic objectives: a multi-modal representation alignment loss that enforces feature-to-expert matching, and a modality-specific decoupling regularization that encourages complementarity across different modalities. This design enables joint optimization, fully exploiting priors from multiple foundation models. Extensive experiments demonstrate that our method significantly outperforms baselines in visual quality and long-term consistency.

* Preprint. 26 pages, 7 figures, with supplementary material
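
The two objectives named in the abstract can be illustrated with a toy sketch. The abstract does not give the loss formulas, so everything here is an assumption made for illustration: cosine distance as the feature-to-expert alignment term, squared cosine similarity between modalities as the decoupling term, and the function names and the 0.1 weighting, which are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)

def alignment_loss(decoupled, experts):
    """Mean (1 - cosine) between each modality feature and its expert feature.

    Assumption: the abstract only says "feature-to-expert matching";
    cosine distance is one common choice, not necessarily the paper's.
    """
    return sum(1.0 - cosine(decoupled[m], experts[m]) for m in decoupled) / len(decoupled)

def decoupling_penalty(decoupled):
    """Mean squared cosine similarity over all pairs of distinct modalities.

    Assumption: a stand-in for the "decoupling regularization"; it is zero
    when the modality-specific features are mutually orthogonal.
    """
    names = sorted(decoupled)
    pairs = [(a, b) for i, a in enumerate(names) for b in names[i + 1:]]
    return sum(cosine(decoupled[a], decoupled[b]) ** 2 for a, b in pairs) / len(pairs)

# Toy modality-specific features (hypothetical values).
decoupled = {"rgb": [1.0, 0.0, 0.0], "depth": [0.0, 1.0, 0.0], "mask": [0.0, 0.0, 1.0]}
# Toy expert features: same directions, different scales.
experts = {"rgb": [2.0, 0.0, 0.0], "depth": [0.0, 3.0, 0.0], "mask": [0.0, 0.0, 0.5]}

# Joint objective; the 0.1 weight is an arbitrary illustrative value.
total = alignment_loss(decoupled, experts) + 0.1 * decoupling_penalty(decoupled)
```

In this toy case each feature points in the same direction as its expert and the three modalities are orthogonal, so both terms are (numerically) zero.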

HiFi-Inpaint: Towards High-Fidelity Reference-Based Inpainting for Generating Detail-Preserving Human-Product Images

Mar 03, 2026

Yichen Liu, Donghao Zhou, Jie Wang, Xin Gao, Guisheng Liu, Jiatong Li, Quanwei Zhang, Qiang Lyu, Lanqing Guo, Shilei Wen(+2 more)

Abstract:Human-product images, which showcase the integration of humans and products, play a vital role in advertising, e-commerce, and digital marketing. The essential challenge of generating such images lies in ensuring the high-fidelity preservation of product details. Among existing paradigms, reference-based inpainting offers a targeted solution by leveraging product reference images to guide the inpainting process. However, limitations remain in three key aspects: the lack of diverse large-scale training data, the struggle of current models to focus on product detail preservation, and the inability of coarse supervision to provide precise guidance. To address these issues, we propose HiFi-Inpaint, a novel high-fidelity reference-based inpainting framework tailored for generating human-product images. HiFi-Inpaint introduces Shared Enhancement Attention (SEA) to refine fine-grained product features and Detail-Aware Loss (DAL) to enforce precise pixel-level supervision using high-frequency maps. Additionally, we construct a new dataset, HP-Image-40K, with samples curated from self-synthesized data and processed with automatic filtering. Experimental results show that HiFi-Inpaint achieves state-of-the-art performance, delivering detail-preserving human-product images.

* Accepted by CVPR 2026 (Project page: https://correr-zhou.github.io/HiFi-Inpaint/)
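
To make the idea of pixel-level supervision on high-frequency maps concrete, here is a minimal sketch. The abstract does not say how DAL computes its high-frequency maps or which distance it uses, so two assumptions are made here: the map is each pixel minus its local 3x3 mean, and the distance is a mean absolute difference. The function names are hypothetical.

```python
def high_freq(img):
    """High-frequency map: each pixel minus the mean of its 3x3 neighbourhood.

    The window is truncated at image borders. Assumption: a simple high-pass
    filter standing in for whatever map the paper actually uses.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ys = range(max(0, y - 1), min(h, y + 2))
            xs = range(max(0, x - 1), min(w, x + 2))
            neigh = [img[j][i] for j in ys for i in xs]
            out[y][x] = img[y][x] - sum(neigh) / len(neigh)
    return out

def detail_loss(pred, target):
    """Mean absolute difference between the two high-frequency maps."""
    hp, ht = high_freq(pred), high_freq(target)
    n = len(pred) * len(pred[0])
    return sum(abs(a - b) for rp, rt in zip(hp, ht) for a, b in zip(rp, rt)) / n

# Toy 3x3 grayscale images (hypothetical values).
flat_dark = [[0.5] * 3 for _ in range(3)]
flat_bright = [[1.0] * 3 for _ in range(3)]
with_detail = [[0.5, 0.5, 0.5], [0.5, 1.0, 0.5], [0.5, 0.5, 0.5]]

loss_flat = detail_loss(flat_dark, flat_bright)    # uniform brightness shift only
loss_detail = detail_loss(flat_dark, with_detail)  # a fine detail is missing
```

Under these assumptions the loss ignores a uniform brightness shift and penalizes only the missing fine detail, which is the behaviour one would want from a detail-focused term used alongside a coarser reconstruction loss.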

Compositional Prototypical Networks for Few-Shot Classification

Jun 11, 2023

Qiang Lyu, Weiqiang Wang

Abstract:It is assumed that pre-training provides the feature extractor with strong class transferability and that high novel class generalization can be achieved by simply reusing the transferable feature extractor. In this work, our motivation is to explicitly learn some fine-grained and transferable meta-knowledge so that feature reusability can be further improved. Concretely, inspired by the fact that humans can use learned concepts or components to help them recognize novel classes, we propose Compositional Prototypical Networks (CPN) to learn a transferable prototype for each human-annotated attribute, which we call a component prototype. We empirically demonstrate that the learned component prototypes have good class transferability and can be reused to construct compositional prototypes for novel classes. Then a learnable weight generator is utilized to adaptively fuse the compositional and visual prototypes. Extensive experiments demonstrate that our method can achieve state-of-the-art results on different datasets and settings. The performance gains are especially remarkable in the 5-way 1-shot setting. The code is available at https://github.com/fikry102/CPN.

* Accepted by AAAI 2023
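
The pipeline in the abstract (component prototypes per attribute, a compositional prototype per novel class, fusion with the visual prototype) can be sketched as follows. This is a toy illustration, not the released CPN code: averaging component prototypes, the fixed fusion weight standing in for the learnable weight generator, nearest-prototype classification by squared Euclidean distance, and all names and values are assumptions.

```python
def mean_vec(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def compositional_prototype(class_attrs, component_protos):
    """Compose a class prototype from the prototypes of its attributes.

    Assumption: a plain average; the abstract does not specify the rule.
    """
    return mean_vec([component_protos[a] for a in class_attrs])

def fuse(comp_proto, visual_proto, weight):
    """Convex combination; `weight` stands in for the learned weight generator."""
    return [weight * c + (1.0 - weight) * v for c, v in zip(comp_proto, visual_proto)]

def classify(query, prototypes):
    """Return the class whose prototype is nearest in squared Euclidean distance."""
    def sqdist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(prototypes, key=lambda name: sqdist(query, prototypes[name]))

# Hypothetical component prototypes, one per annotated attribute.
component_protos = {"striped": [1.0, 0.0], "furry": [1.0, 1.0], "winged": [0.0, 1.0]}
# Novel classes described by their attributes, with one support feature each (1-shot).
class_attrs = {"zebra": ["striped", "furry"], "bird": ["winged"]}
support = {"zebra": [[0.8, 0.4]], "bird": [[0.2, 0.9]]}

prototypes = {
    name: fuse(compositional_prototype(attrs, component_protos), mean_vec(support[name]), 0.5)
    for name, attrs in class_attrs.items()
}
prediction = classify([0.9, 0.3], prototypes)
```

With only one support example per class, the visual prototype is a single noisy feature, so the attribute-based prototype acts as a prior. This is consistent with the abstract's remark that the gains are largest in the 1-shot setting.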

Discovering indicators of dark horse of soccer games by deep learning from sequential trading data

Aug 04, 2020

Liyao Lu, Qiang Lyu

Abstract:It is no surprise that machine learning models can predict soccer game outcomes with decent accuracy from various objective metrics. However, their performance is much weaker on difficult and valuable matches. A deep learning model is designed and trained on real sequential trading data from a real prediction market, under the assumption that such trading data contain critical latent information that determines game outcomes. To train the model, a new loss function is proposed that biases selection toward matches with high investment return. A full investigation of 4669 top soccer league matches showed that our model traded off prediction accuracy for high-value return, owing to a certain ability to detect dark horses. A further attempt is made to depict some indicators discovered by our model that describe key features of big dark horses and regular hot horses.
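
One way to picture a loss that "biases the selection toward matches with high investment return" is a return-weighted cross-entropy. The abstract does not give the actual loss, so this sketch is an assumption throughout: each match is weighted by the net profit a unit stake on the true outcome would have earned, computed from decimal odds. The function name and data layout are hypothetical.

```python
import math

def return_weighted_loss(probs, outcomes, odds):
    """Cross-entropy where each match is weighted by its potential net return.

    probs:    per-match predicted probabilities over (home, draw, away)
    outcomes: per-match index of the true outcome
    odds:     per-match decimal odds over (home, draw, away)
    Assumption: weight = odds of the true outcome minus 1 (net profit per unit stake).
    """
    total = 0.0
    weight_sum = 0.0
    for p, y, o in zip(probs, outcomes, odds):
        w = o[y] - 1.0
        total += -w * math.log(p[y] + 1e-12)
        weight_sum += w
    return total / weight_sum

# Two toy matches: a favourite wins at odds 1.5, a dark horse wins at odds 6.0.
outcomes = [0, 2]
odds = [[1.5, 4.0, 6.0], [1.5, 4.0, 6.0]]

# Model A is right about the favourite and wrong about the dark horse.
probs_a = [[0.8, 0.1, 0.1], [0.7, 0.1, 0.2]]
# Model B is wrong about the favourite and right about the dark horse.
probs_b = [[0.2, 0.1, 0.7], [0.1, 0.1, 0.8]]

loss_a = return_weighted_loss(probs_a, outcomes, odds)
loss_b = return_weighted_loss(probs_b, outcomes, odds)
```

Both models get one match right, so their accuracy is equal, but the weighted loss favours model B because its correct call is the high-return one. This mirrors the trade-off of prediction accuracy for return described in the abstract.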

Extracting Actionability from Machine Learning Models by Sub-optimal Deterministic Planning

Nov 03, 2016

Qiang Lyu, Yixin Chen, Zhaorong Li, Zhicheng Cui, Ling Chen, Xing Zhang, Haihua Shen

Abstract:A main focus of machine learning research has been improving the generalization accuracy and efficiency of prediction models. Many models such as SVM, random forest, and deep neural nets have been proposed and achieved great success. However, what emerges as missing in many applications is actionability, i.e., the ability to turn prediction results into actions. For example, in applications such as customer relationship management, clinical prediction, and advertisement, the users need not only accurate prediction, but also actionable instructions which can transfer an input to a desirable goal (e.g., higher profits, lower morbidity rates, higher ad hit rates). Existing efforts to derive such actionable knowledge are few and limited to simple action models that change only one attribute per action. The dilemma is that in many real applications the action models are often more complex, and an optimal solution is harder to extract. In this paper, we propose a novel approach that achieves actionability by combining learning with planning, two core areas of AI. In particular, we propose a framework to extract actionable knowledge from random forest, one of the most widely used and best off-the-shelf classifiers. We formulate the actionability problem as a sub-optimal action planning (SOAP) problem, which is to find a plan to alter certain features of a given input so that the random forest would yield a desirable output, while minimizing the total cost of the actions. Technically, the SOAP problem is formulated in the SAS+ planning formalism, and solved using a Max-SAT based approach. Our experimental results demonstrate the effectiveness and efficiency of the proposed approach on a personal credit dataset and other benchmarks. Our work represents a new application of automated planning to an emerging and challenging machine learning paradigm.

* 16 pages, 4 figures
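
The problem stated in the abstract (alter features of an input so that the forest yields the desired output, at minimum total action cost) can be illustrated on a toy instance. This sketch deliberately swaps the paper's SAS+ formulation and Max-SAT solver for an exhaustive search, which is only feasible for tiny feature domains. The forest, feature names, and costs are all hypothetical.

```python
from itertools import product

def forest_predict(trees, x):
    """Majority vote of binary trees; returns 1 for the desirable class."""
    votes = sum(tree(x) for tree in trees)
    return 1 if votes * 2 > len(trees) else 0

def cheapest_plan(trees, x, domains, cost):
    """Exhaustively search all feature assignments for the cheapest one
    that the forest classifies as 1. Returns (total_cost, changes) or None.

    Assumption: brute force replaces the Max-SAT encoding used in the paper.
    """
    best = None
    names = sorted(domains)
    for values in product(*(domains[n] for n in names)):
        candidate = dict(zip(names, values))
        if forest_predict(trees, candidate) != 1:
            continue
        changes = {n: candidate[n] for n in names if candidate[n] != x[n]}
        total = sum(cost[n] for n in changes)
        if best is None or total < best[0]:
            best = (total, changes)
    return best

# A hypothetical three-stump "forest" for credit approval.
trees = [
    lambda x: 1 if x["income"] == "high" else 0,
    lambda x: 1 if x["debt"] == "low" else 0,
    lambda x: 1 if x["has_card"] else 0,
]
domains = {"income": ["low", "high"], "debt": ["high", "low"], "has_card": [False, True]}
cost = {"income": 10, "debt": 4, "has_card": 1}  # cost of changing each feature

applicant = {"income": "low", "debt": "high", "has_card": False}
plan = cheapest_plan(trees, applicant, domains, cost)
```

Here the cheapest plan changes two attributes at once (paying down debt and opening a card), something the single-attribute action models criticized in the abstract could not express as one plan.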
