Tianlu Wang

CoT-Self-Instruct: Building high-quality synthetic prompts for reasoning and non-reasoning tasks

Jul 31, 2025
Via arXiv

Bridging Offline and Online Reinforcement Learning for LLMs

Jun 26, 2025
Via arXiv

J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning

May 15, 2025
Via arXiv

Multi-Token Attention

Apr 01, 2025
Via arXiv

Learning to Plan & Reason for Evaluation with Thinking-LLM-as-a-Judge

Jan 30, 2025
Via arXiv

Calibrate to Discriminate: Improve In-Context Learning with Label-Free Comparative Inference

Oct 03, 2024
Via arXiv

Self-Taught Evaluators

Aug 05, 2024
Via arXiv

Contextual Position Encoding: Learning to Count What's Important

May 29, 2024
Via arXiv

Efficient Tool Use with Chain-of-Abstraction Reasoning

Jan 30, 2024
Via arXiv

PathFinder: Guided Search over Multi-Step Reasoning Paths

Dec 12, 2023
Via arXiv