Xiaojun Chang

Ground-R1: Incentivizing Grounded Visual Reasoning via Reinforcement Learning

May 26, 2025

Semantic-enhanced Co-attention Prompt Learning for Non-overlapping Cross-Domain Recommendation

May 25, 2025

CoNav: Collaborative Cross-Modal Reasoning for Embodied Navigation

May 22, 2025

Token-Level Prompt Mixture with Parameter-Free Routing for Federated Domain Generalization

Apr 29, 2025

MDK12-Bench: A Multi-Discipline Benchmark for Evaluating Reasoning in Multimodal Large Language Models

Apr 08, 2025

Learning A Zero-shot Occupancy Network from Vision Foundation Models via Self-supervised Adaptation

Mar 10, 2025

Dense Audio-Visual Event Localization under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

Dec 17, 2024

HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation

Dec 15, 2024

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation

Dec 11, 2024

GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

Dec 01, 2024