Kazuki Yano

Layerwise Importance Analysis of Feed-Forward Networks in Transformer-based Language Models

Aug 25, 2025
Via arXiv

STEP: Staged Parameter-Efficient Pre-training for Large Language Models

Apr 05, 2025
Via arXiv

Efficient Construction of Model Family through Progressive Training Using Model Expansion

Apr 01, 2025
Via arXiv