Yisong Yue

California Institute of Technology

Evaluating Agentic Optimization on Large Codebases

Mar 16, 2026
Via arXiv

Actor-Curator: Co-adaptive Curriculum Learning via Policy-Improvement Bandits for RL Post-Training

Feb 24, 2026
Via arXiv

Krause Synchronization Transformers

Feb 12, 2026
Via arXiv

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Jan 04, 2026
Via arXiv

Embodied Learning of Reward for Musculoskeletal Control with Vision Language Models

Dec 28, 2025
Via arXiv

Feedforward 3D Editing via Text-Steerable Image-to-3D

Dec 15, 2025
Via arXiv

Learning Time-Scale Invariant Population-Level Neural Representations

Nov 17, 2025
Via arXiv

A Narwhal-Inspired Sensing-to-Control Framework for Small Fixed-Wing Aircraft

Oct 08, 2025
Via arXiv

Kuramoto Orientation Diffusion Models

Sep 18, 2025
Via arXiv

TDRM: Smooth Reward Models with Temporal Difference for LLM RL and Inference

Sep 18, 2025
Via arXiv