Gang Xiong

Scaling Offline Model-Based RL via Jointly-Optimized World-Action Model Pretraining

Oct 01, 2024
Via arXiv

T2VIndexer: A Generative Video Indexer for Efficient Text-Video Retrieval

Aug 21, 2024
Via arXiv

IIU: Independent Inference Units for Knowledge-based Visual Question Answering

Aug 15, 2024
Via arXiv

Visual-Semantic Decomposition and Partial Alignment for Document-based Zero-Shot Learning

Jul 23, 2024
Via arXiv

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Mar 20, 2024
Via arXiv

RIME: Robust Preference-based Reinforcement Learning with Noisy Preferences

Mar 12, 2024
Via arXiv

Watermarking Vision-Language Pre-trained Models for Multi-modal Embedding as a Service

Nov 10, 2023
Via arXiv

Align before Search: Aligning Ads Image to Text for Accurate Cross-Modal Sponsored Search

Sep 28, 2023
Via arXiv

Context-I2W: Mapping Images to Context-dependent Words for Accurate Zero-Shot Composed Image Retrieval

Sep 28, 2023
Via arXiv

Evaluate Geometry of Radiance Field with Low-frequency Color Prior

Apr 10, 2023
Via arXiv