Xuelong Li

Riemannian Optimization on Relaxed Indicator Matrix Manifold

Mar 26, 2025

Optimal Transport Adapter Tuning for Bridging Modality Gaps in Few-Shot Remote Sensing Scene Classification

Mar 19, 2025

Task-Oriented Feature Compression for Multimodal Understanding via Device-Edge Co-Inference

Mar 17, 2025

Towards Learnable Anchor for Deep Multi-View Clustering

Mar 16, 2025

MoMa-Kitchen: A 100K+ Benchmark for Affordance-Grounded Last-Mile Navigation in Mobile Manipulation

Mar 14, 2025

Bidirectional Prototype-Reward co-Evolution for Test-Time Adaptation of Vision-Language Models

Mar 12, 2025

NFIG: Autoregressive Image Generation with Next-Frequency Prediction

Mar 10, 2025

From Captions to Rewards (CAREVL): Leveraging Large Language Model Experts for Enhanced Reward Modeling in Large Vision-Language Models

Mar 08, 2025

Underlying Semantic Diffusion for Effective and Efficient In-Context Learning

Mar 06, 2025

DualSpec: Text-to-spatial-audio Generation via Dual-Spectrogram Guided Diffusion Model

Feb 26, 2025