Bingxiang He

May

Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

Oct 02, 2025

A Survey of Reinforcement Learning for Large Reasoning Models

Sep 10, 2025

MiniCPM4: Ultra-Efficient LLMs on End Devices

Jun 09, 2025

AIR: A Systematic Analysis of Annotations, Instructions, and Response Pairs in Preference Dataset

Apr 04, 2025

Process Reinforcement through Implicit Rewards

Feb 03, 2025

EscapeBench: Pushing Language Models to Think Outside the Box

Dec 18, 2024

Zero-Shot Generalization during Instruction Tuning: Insights from Similarity and Granularity

Jun 17, 2024

Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents

Feb 15, 2024

A Unified Evaluation of Textual Backdoor Learning: Frameworks and Benchmarks

Jun 17, 2022