Xueguang Ma

Fixing Data That Hurts Performance: Cascading LLMs to Relabel Hard Negatives for Robust Information Retrieval

May 22, 2025
Via arXiv

General-Reasoner: Advancing LLM Reasoning Across All Domains

May 21, 2025
Via arXiv

Tevatron 2.0: Unified Document Retrieval Toolkit across Scale, Language, and Modality

May 05, 2025
Via arXiv

ScholarCopilot: Training Large Language Models for Academic Writing with Accurate Citations

Apr 03, 2025
Via arXiv

Rank-R1: Enhancing Reasoning in LLM-based Document Rerankers via Reinforcement Learning

Mar 08, 2025
Via arXiv

DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers

Feb 25, 2025
Via arXiv

PixelWorld: Towards Perceiving Everything as Pixels

Jan 31, 2025
Via arXiv

Document Screenshot Retrievers are Vulnerable to Pixel Poisoning Attacks

Jan 28, 2025
Via arXiv

VISA: Retrieval Augmented Generation with Visual Source Attribution

Dec 19, 2024
Via arXiv

LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Jun 21, 2024
Via arXiv