Jiajun Chai

Promoting Efficient Reasoning with Verifiable Stepwise Reward

Aug 14, 2025
Via arXiv

SRFT: A Single-Stage Method with Supervised and Reinforcement Fine-Tuning for Reasoning

Jun 24, 2025
Via arXiv

DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy

Jun 11, 2025
Via arXiv

A Hierarchical Deep Reinforcement Learning Framework for 6-DOF UCAV Air-to-Air Combat

Dec 05, 2022
Via arXiv