Baobao Chang

BabyVision: Visual Reasoning Beyond Language

Jan 10, 2026

FaithLens: Detecting and Explaining Faithfulness Hallucination

Dec 23, 2025

Teaching Large Language Models to Maintain Contextual Faithfulness via Synthetic Tasks and Reinforcement Learning

May 22, 2025

G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning

May 19, 2025

Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think

Feb 27, 2025

Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering

Feb 11, 2025

UltraIF: Advancing Instruction Following from the Wild

Feb 06, 2025

Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey

Dec 30, 2024

Looking Beyond Text: Reducing Language bias in Large Vision-Language Models via Multimodal Dual-Attention and Soft-Image Guidance

Nov 21, 2024

Selecting Influential Samples for Long Context Alignment via Homologous Models' Guidance and Contextual Awareness Measurement

Oct 21, 2024