Jianhao Yan

Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

Feb 12, 2026

Detecting RLVR Training Data via Structural Convergence of Reasoning

Feb 12, 2026

P1-VL: Bridging Visual Perception and Scientific Reasoning in Physics Olympiads

Feb 10, 2026

BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

Jan 16, 2026

Learning to Reason under Off-Policy Guidance

Apr 22, 2025

A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Mar 27, 2025

RefuteBench 2.0 -- Agentic Benchmark for Dynamic Evaluation of LLM Responses to Refutation Instruction

Feb 25, 2025

Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing

Feb 21, 2025

Benchmarking GPT-4 against Human Translators: A Comprehensive Evaluation Across Languages, Domains, and Expertise Levels

Nov 21, 2024

ELICIT: LLM Augmentation via External In-Context Capability

Oct 12, 2024