Yutaka Matsuo

NavWAM: A Navigation World Action Model for Goal-Conditioned Visual Navigation

Jun 11, 2026

SMC-ITA: Sequential Monte Carlo Inference-Time Alignment for Video-to-Audio Generation

Jun 07, 2026

OrderGrad: Optimizing Beyond the Mean with Order-Statistic Policy Gradient Estimation

Jun 04, 2026

On Advantage Estimates for Max@K Policy Gradients

Jun 04, 2026

Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models

Jun 02, 2026

Emergence of Exploration in Policy Gradient Reinforcement Learning via Retrying

May 29, 2026

Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

May 27, 2026

JMed48k: A Multi-Profession Japanese Medical Licensing Benchmark for Vision-Language Model Evaluation

May 21, 2026

E3VS-Bench: A Benchmark for Viewpoint-Dependent Active Perception in 3D Gaussian Splatting Scenes

Apr 20, 2026

Does "Do Differentiable Simulators Give Better Policy Gradients?" Give Better Policy Gradients?

Apr 20, 2026