Xiaobo Xia

VCM: Vision Concept Modeling Based on Implicit Contrastive Learning with Vision-Language Instruction Fine-Tuning

Apr 28, 2025
Via arXiv

GUI-R1 : A Generalist R1-Style Vision-Language Action Model For GUI Agents

Apr 15, 2025
Via arXiv

Continual Multimodal Contrastive Learning

Mar 19, 2025
Via arXiv

Identifying Trustworthiness Challenges in Deep Learning Models for Continental-Scale Water Quality Prediction

Mar 13, 2025
Via arXiv

DreamDPO: Aligning Text-to-3D Generation with Human Preferences via Direct Preference Optimization

Feb 05, 2025
Via arXiv

Towards Modality Generalization: A Benchmark and Prospective Analysis

Dec 24, 2024
Via arXiv

LaVin-DiT: Large Vision Diffusion Transformer

Nov 18, 2024
Via arXiv

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

Sep 09, 2024
Via arXiv

Resultant: Incremental Effectiveness on Likelihood for Unsupervised Out-of-Distribution Detection

Sep 05, 2024
Via arXiv

Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs

Jun 27, 2024
Via arXiv