Yiqin Yang

SC2Arena and StarEvolve: Benchmark and Self-Improvement Framework for LLMs in Complex Decision-Making Tasks

Aug 14, 2025

DPMT: Dual Process Multi-scale Theory of Mind Framework for Real-time Human-AI Collaboration

Jul 18, 2025

Fewer May Be Better: Enhancing Offline Reinforcement Learning with Reduced Dataset

Feb 26, 2025

Episodic Novelty Through Temporal Distance

Jan 26, 2025

S-EPOA: Overcoming the Indivisibility of Annotations with Skill-Driven Preference-Based Reinforcement Learning

Aug 22, 2024

Bayesian Design Principles for Offline-to-Online Reinforcement Learning

May 31, 2024

No Prior Mask: Eliminate Redundant Action for Deep Reinforcement Learning

Dec 11, 2023

Unsupervised Behavior Extraction via Random Intent Priors

Oct 28, 2023

Learning Diverse Risk Preferences in Population-based Self-play

May 19, 2023

The Provable Benefits of Unsupervised Data Sharing for Offline Reinforcement Learning

Feb 27, 2023