Jiahao Xu

DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning

May 29, 2025
Via arXiv

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

May 20, 2025
Via arXiv

Optimal Client Sampling in Federated Learning with Client-Level Heterogeneous Differential Privacy

May 19, 2025
Via arXiv

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

May 19, 2025
Via arXiv

Traceable Black-box Watermarks for Federated Learning

May 19, 2025
Via arXiv

DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning

Apr 15, 2025
Via arXiv

Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique

Mar 21, 2025
Via arXiv

RaSA: Rank-Sharing Low-Rank Adaptation

Mar 16, 2025
Via arXiv

Detecting Backdoor Attacks in Federated Learning via Direction Alignment Inspection

Mar 11, 2025
Via arXiv

The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models

Mar 04, 2025
Via arXiv