Jun Suzuki

Optimal Sparsity of Mixture-of-Experts Language Models for Reasoning Tasks

Aug 26, 2025
Via arXiv

Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models

Aug 25, 2025
Via arXiv

VDocRAG: Retrieval-Augmented Generation over Visually-Rich Documents

Apr 14, 2025
Via arXiv

STEP: Staged Parameter-Efficient Pre-training for Large Language Models

Apr 05, 2025
Via arXiv

Efficient Construction of Model Family through Progressive Training Using Model Expansion

Apr 01, 2025
Via arXiv

Drop-Upcycling: Training Sparse Mixture of Experts with Partial Re-initialization

Feb 26, 2025
Via arXiv

Reference-free Evaluation Metrics for Text Generation: A Survey

Jan 21, 2025
Via arXiv

Can Input Attributions Interpret the Inductive Reasoning Process Elicited in In-Context Learning?

Dec 20, 2024
Via arXiv

Pruning Multilingual Large Language Models for Multilingual Inference

Sep 25, 2024
Via arXiv

MQM-Chat: Multidimensional Quality Metrics for Chat Translation

Aug 29, 2024
Via arXiv