Keisuke Sakaguchi

Expert Evaluation of LLM's Open-Ended Legal Reasoning on the Japanese Bar Exam Writing Task

Apr 26, 2026

Unlocking Prompt Infilling Capability for Diffusion Language Models

Apr 04, 2026

Nodes Are Early, Edges Are Late: Probing Diagram Representations in Large Vision-Language Models

Mar 03, 2026

Can Language Models Handle a Non-Gregorian Calendar?

Sep 04, 2025

Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset

Mar 31, 2025

Weight-based Analysis of Detokenization in Language Models: Understanding the First Stage of Inference Without Inference

Jan 27, 2025

Think-to-Talk or Talk-to-Think? When LLMs Come Up with an Answer in Multi-Step Reasoning

Dec 02, 2024

Self-Training Meets Consistency: Improving LLMs' Reasoning With Consistency-Driven Rationale Evaluation

Nov 22, 2024

Empirical Analysis of Large Vision-Language Models against Goal Hijacking via Visual Prompt Injection

Aug 07, 2024

LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

Jul 04, 2024