Xing Sun

Zooming from Context to Cue: Hierarchical Preference Optimization for Multi-Image MLLMs

May 28, 2025
Via arXiv

TACO: Think-Answer Consistency for Optimized Long-Chain Reasoning and Efficient Data Learning via Reinforcement Learning in LVLMs

May 27, 2025
Via arXiv

VITA-Audio: Fast Interleaved Cross-Modal Token Generation for Efficient Large Speech-Language Model

May 06, 2025
Via arXiv

Sequential-NIAH: A Needle-In-A-Haystack Benchmark for Extracting Sequential Needles from Long Contexts

Apr 09, 2025
Via arXiv

FactGuard: Leveraging Multi-Agent Systems to Generate Answerable and Unanswerable Questions for Enhanced Long-Context LLM Extraction

Apr 08, 2025
Via arXiv

LLaVA-RadZ: Can Multimodal Large Language Models Effectively Tackle Zero-shot Radiology Recognition?

Mar 10, 2025
Via arXiv

Human Cognition Inspired RAG with Knowledge Graph for Complex Problem Solving

Mar 09, 2025
Via arXiv

RocketEval: Efficient Automated LLM Evaluation via Grading Checklist

Mar 07, 2025
Via arXiv

FlowAgent: Achieving Compliance and Flexibility for Workflow Agents

Feb 20, 2025
Via arXiv

RoleMRC: A Fine-Grained Composite Benchmark for Role-Playing and Instruction-Following

Feb 17, 2025
Via arXiv