Zhanxuan Hu

OTFusion: Bridging Vision-only and Vision-Language Models via Optimal Transport for Transductive Zero-Shot Learning

Jun 16, 2025

Qiyu Xu, Wenyang Chen, Zhanxuan Hu, Huafeng Li, Yonghang Tai

Abstract: Transductive zero-shot learning (ZSL) aims to classify unseen categories by leveraging both semantic class descriptions and the distribution of unlabeled test data. While Vision-Language Models (VLMs) such as CLIP excel at aligning visual inputs with textual semantics, they often rely too heavily on class-level priors and fail to capture fine-grained visual cues. In contrast, Vision-only Foundation Models (VFMs) like DINOv2 provide rich perceptual features but lack semantic alignment. To exploit the complementary strengths of these models, we propose OTFusion, a simple yet effective training-free framework that bridges VLMs and VFMs via Optimal Transport. Specifically, OTFusion aims to learn a shared probabilistic representation that aligns visual and semantic information by minimizing the transport cost between their respective distributions. This unified distribution enables coherent class predictions that are both semantically meaningful and visually grounded. Extensive experiments on 11 benchmark datasets demonstrate that OTFusion consistently outperforms the original CLIP model, achieving an average accuracy improvement of nearly 10%, all without any fine-tuning or additional annotations. The code will be publicly released after the paper is accepted.
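
The abstract describes predicting classes by minimizing a transport cost between distributions. As a rough illustration of that general idea only, and not the OTFusion algorithm itself, the sketch below runs standard Sinkhorn iterations for entropy-regularized optimal transport between a set of samples and a set of classes. The function name `sinkhorn`, the toy similarity matrix, the uniform class marginal, and the values of `reg` and `n_iters` are all assumptions made for this example.

```python
import numpy as np

def sinkhorn(cost, reg=0.5, n_iters=200):
    """Entropy-regularized optimal transport with uniform marginals.

    cost: (n_samples, n_classes) matrix; lower means a better match.
    Returns a transport plan of the same shape whose entries sum to 1.
    """
    n, k = cost.shape
    a = np.full(n, 1.0 / n)      # each sample carries equal mass
    b = np.full(k, 1.0 / k)      # assumption: classes are balanced
    K = np.exp(-cost / reg)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows toward marginal a
        v = b / (K.T @ u)        # rescale columns toward marginal b
    return u[:, None] * K * v[None, :]

# Hypothetical toy input: similarity of 6 samples to 3 class prototypes.
rng = np.random.default_rng(0)
sim = rng.uniform(size=(6, 3))
plan = sinkhorn(1.0 - sim)       # cost = 1 - similarity
pred = plan.argmax(axis=1)       # hard label per sample from the soft plan
```

Each row of `plan` can be read as a soft class assignment for one sample; the column constraint is what lets the distribution of the unlabeled test set influence individual predictions.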

An Iteratively Re-weighted Method for Problems with Sparsity-Inducing Norms

Jul 02, 2019

Feiping Nie, Zhanxuan Hu, Xiaoqian Wang, Rong Wang, Xuelong Li, Heng Huang

Abstract: This work aims to solve problems with intractable sparsity-inducing norms, which are often encountered in machine learning tasks such as multi-task learning, subspace clustering, feature selection, and robust principal component analysis. Specifically, an Iteratively Re-Weighted method (IRW) with a solid convergence guarantee is provided. We investigate its convergence speed via numerous experiments on real data. Furthermore, to validate the practicality of IRW, we use it to solve a concrete robust feature selection model with a complicated objective function. The experimental results show that the model coupled with the proposed optimization method significantly outperforms alternative methods.

* 11 pages, 3 figures
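
To make the re-weighting idea concrete, here is a minimal sketch of a classic iteratively re-weighted least squares scheme for a simpler, standard problem: least squares with an l2,1-norm penalty on the rows of W. It is an assumption-laden illustration and not the IRW method or the robust feature selection model from the paper. The function name `irls_l21`, the smoothing constant `eps`, the ridge initialization, and the toy data are all hypothetical choices.

```python
import numpy as np

def irls_l21(X, Y, lam=5.0, n_iters=50, eps=1e-8):
    """Minimize ||XW - Y||_F^2 + lam * sum_i sqrt(||w_i||^2 + eps)
    by repeatedly solving a weighted ridge problem."""
    d = X.shape[1]
    XtX, XtY = X.T @ X, X.T @ Y
    W = np.linalg.solve(XtX + lam * np.eye(d), XtY)   # ridge initialization
    history = []
    for _ in range(n_iters):
        # Re-weighting step: rows with small norm get a large penalty weight.
        row_norms = np.sqrt((W ** 2).sum(axis=1) + eps)
        D = np.diag(1.0 / (2.0 * row_norms))
        # Weighted least squares step with the weights held fixed.
        W = np.linalg.solve(XtX + lam * D, XtY)
        smooth_l21 = np.sqrt((W ** 2).sum(axis=1) + eps).sum()
        history.append(((X @ W - Y) ** 2).sum() + lam * smooth_l21)
    return W, history

# Hypothetical toy data: only the first 3 of 10 features are informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
W_true = np.zeros((10, 3))
W_true[:3] = rng.normal(size=(3, 3))
Y = X @ W_true + 0.01 * rng.normal(size=(40, 3))

W, history = irls_l21(X, Y)
row_scores = np.linalg.norm(W, axis=1)   # larger norm = more relevant feature
```

Each iteration minimizes a quadratic upper bound of the smoothed objective, so the values in `history` do not increase; this is the kind of convergence behavior that re-weighted schemes are typically analyzed for.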

A Comprehensive Survey for Low Rank Regularization

Sep 14, 2018

Zhanxuan Hu, Feiping Nie, Lai Tian, Rong Wang, Xuelong Li

Abstract: Low rank regularization, in essence, introduces a low rank or approximately low rank assumption on the matrix we aim to learn, and it has achieved great success in many fields, including machine learning, data mining, and computer vision. Over the last decade, much progress has been made in both theory and practical applications. Nevertheless, the overlap between the two remains slight. To build a bridge between practical applications and theoretical research, in this paper we provide a comprehensive survey of low rank regularization. We first review several traditional machine learning models that use low rank regularization, and then show how they (or their variants) are applied to practical problems such as non-rigid structure from motion and image denoising. Subsequently, we summarize the regularizers and optimization methods that achieve great success in traditional machine learning tasks but are rarely seen in solving practical issues. Finally, we discuss and compare some representative regularizers, including convex and non-convex relaxations. Extensive experimental results demonstrate that non-convex regularizers can provide a large advantage over the nuclear norm, the regularizer widely used in solving practical issues.

* 16 pages, 4 figures, 4 tables
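
For readers unfamiliar with the nuclear norm mentioned in the abstract, the sketch below shows its proximal operator, singular value soft-thresholding, which is the basic building block of many nuclear-norm solvers. It is a generic illustration and is not taken from the survey; the function name `svt`, the threshold `tau`, and the toy matrix are assumptions for this example.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: the proximal operator of
    tau * (nuclear norm), i.e. soft-threshold each singular value."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # small singular values become zero
    return (U * s_shrunk) @ Vt            # rebuild the matrix from shrunk values

# Hypothetical toy input: a rank-2 matrix corrupted by small dense noise.
rng = np.random.default_rng(0)
low_rank = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
M = low_rank + 0.05 * rng.normal(size=(20, 15))

tau = 1.0
L = svt(M, tau)
est_rank = np.linalg.matrix_rank(L)       # rank after thresholding
```

Soft-thresholding shrinks every singular value by the same amount `tau`, including the large ones that carry the signal. Non-convex regularizers, which the abstract reports as outperforming the nuclear norm, generally penalize large singular values less.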
