Document Embedding


Evo-Retriever: LLM-Guided Curriculum Evolution with Viewpoint-Pathway Collaboration for Multimodal Document Retrieval

Mar 17, 2026 (via arXiv)

CWoMP: Morpheme Representation Learning for Interlinear Glossing

Mar 18, 2026 (via arXiv)

KidsNanny: A Two-Stage Multimodal Content Moderation Pipeline Integrating Visual Classification, Object Detection, OCR, and Contextual Reasoning for Child Safety

Mar 17, 2026 (via arXiv)

Automatic Inter-document Multi-hop Scientific QA Generation

Mar 15, 2026 (via arXiv)

Negation is Not Semantic: Diagnosing Dense Retrieval Failure Modes for Trade-offs in Contradiction-Aware Biomedical QA

Mar 18, 2026 (via arXiv)

You Told Me to Do It: Measuring Instructional Text-induced Private Data Leakage in LLM Agents

Mar 12, 2026 (via arXiv)

Towards Unsupervised Adversarial Document Detection in Retrieval Augmented Generation Systems

Mar 17, 2026 (via arXiv)

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

Mar 16, 2026 (via arXiv)

OPERA: Online Data Pruning for Efficient Retrieval Model Adaptation

Mar 17, 2026 (via arXiv)

NanoVDR: Distilling a 2B Vision-Language Retriever into a 70M Text-Only Encoder for Visual Document Retrieval

Mar 13, 2026 (via arXiv)