Text


LeapAlign: Post-Training Flow Matching Models at Any Generation Step by Building Two-Step Trajectories

Apr 16, 2026
Via arXiv

MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events

Apr 16, 2026
Via arXiv

IUQ: Interrogative Uncertainty Quantification for Long-Form Large Language Model Generation

Apr 16, 2026
Via arXiv

From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution

Apr 16, 2026
Via arXiv

From Reactive to Proactive: Assessing the Proactivity of Voice Agents via ProVoice-Bench

Apr 16, 2026
Via arXiv

RaTA-Tool: Retrieval-based Tool Selection with Multimodal Large Language Models

Apr 16, 2026
Via arXiv

Text2Arch: A Dataset for Generating Scientific Architecture Diagrams from Natural Language Descriptions

Apr 16, 2026
Via arXiv

IE as Cache: Information Extraction Enhanced Agentic Reasoning

Apr 16, 2026
Via arXiv

Domain Fine-Tuning FinBERT on Finnish Histopathological Reports: Train-Time Signals and Downstream Correlations

Apr 16, 2026
Via arXiv

Knowing When Not to Answer: Evaluating Abstention in Multimodal Reasoning Systems

Apr 16, 2026
Via arXiv