Jiajun Song

MARCH: Multi-Agent Reinforced Self-Check for LLM Hallucination

Mar 25, 2026

CLIPO: Contrastive Learning in Policy Optimization Generalizes RLVR

Mar 10, 2026

A Unified Representation Underlying the Judgment of Large Language Models

Oct 31, 2025

VARMA-Enhanced Transformer for Time Series Forecasting

Sep 05, 2025

SalientFusion: Context-Aware Compositional Zero-Shot Food Recognition

Sep 04, 2025

Mind the Gap: The Divergence Between Human and LLM-Generated Tasks

Aug 01, 2025

ToM-RL: Reinforcement Learning Unlocks Theory of Mind in Small LLMs

Apr 02, 2025

OCRBench v2: An Improved Benchmark for Evaluating Large Multimodal Models on Visual Text Localization and Reasoning

Dec 31, 2024

Proposing and solving olympiad geometry with guided tree search

Dec 14, 2024

GATE OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

Dec 01, 2024