Taro Yano

An Empirical Study of LLM-as-a-Judge: How Design Choices Impact Evaluation Reliability

Jun 16, 2025
Via arXiv

LaMDAgent: An Autonomous Framework for Post-Training Pipeline Optimization via LLM Agents

May 28, 2025
Via arXiv

Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning

May 15, 2025
Via arXiv

Can a Crow Hatch a Falcon? Lineage Matters in Predicting Large Language Model Performance

Apr 28, 2025
Via arXiv

Can Large Language Models Invent Algorithms to Improve Themselves?

Oct 21, 2024
Via arXiv