Takeshi Kojima

OrderGrad: Optimizing Beyond the Mean with Order-Statistic Policy Gradient Estimation

Jun 04, 2026

On Advantage Estimates for Max@K Policy Gradients

Jun 04, 2026

Clustered Self-Assessment: A Simple yet Effective Method for Uncertainty Quantification in Large Language Models

Jun 02, 2026

Zipping the Thought: When and How Compressed Reasoning Data Works in LLM Post-Training

May 27, 2026

Semantic Token Clustering for Efficient Uncertainty Quantification in Large Language Models

Mar 20, 2026

ClinDet-Bench: Beyond Abstention, Evaluating Judgment Determinability of LLMs in Clinical Decision-Making

Feb 26, 2026

Emergent Analogical Reasoning in Transformers

Feb 03, 2026

$\infty$-MoE: Generalizing Mixture of Experts to Infinite Experts

Jan 25, 2026

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

Jun 06, 2025

Inconsistent Tokenizations Cause Language Models to be Perplexed by Japanese Grammar

May 26, 2025