Nan Jiang

Faculty of Information Technology, Beijing University of Technology, Beijing, China; Beijing Key Laboratory of Trusted Computing, Beijing, China; National Engineering Laboratory for Critical Technologies of Information Security Classified Protection, Beijing, China

Thinking vs. Doing: Agents that Reason by Scaling Test-Time Interaction

Jun 09, 2025

A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

May 25, 2025

From Questions to Clinical Recommendations: Large Language Models Driving Evidence-Based Clinical Decision Making

May 15, 2025

Joint Graph Convolution and Sequential Modeling for Scalable Network Traffic Estimation

May 12, 2025

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

May 05, 2025

Generative Auto-Bidding with Value-Guided Explorations

Apr 20, 2025

A Minimalist Approach to LLM Reasoning: from Rejection Sampling to Reinforce

Apr 15, 2025

Context-Aware Adaptive Sampling for Intelligent Data Acquisition Systems Using DQN

Apr 12, 2025

Improving Harmful Text Detection with Joint Retrieval and External Knowledge

Apr 03, 2025

Is Best-of-N the Best of Them? Coverage, Scaling, and Optimality in Inference-Time Alignment

Mar 27, 2025