Shota Takashiro

On Advantage Estimates for Max@K Policy Gradients

Jun 04, 2026
Via arXiv

OrderGrad: Optimizing Beyond the Mean with Order-Statistic Policy Gradient Estimation

Jun 04, 2026
Via arXiv

Thinking While Listening: Fast-Slow Recurrence for Long-Horizon Sequential Modeling

Apr 02, 2026
Via arXiv

$\infty$-MoE: Generalizing Mixture of Experts to Infinite Experts

Jan 25, 2026
Via arXiv

Answer When Needed, Forget When Not: Language Models Pretend to Forget via In-Context Knowledge Unlearning

Oct 01, 2024
Via arXiv