Taiji Suzuki

Quantifying Memory Utilization with Effective State-Size

Apr 28, 2025

When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars

Apr 24, 2025

Propagation of Chaos for Mean-Field Langevin Dynamics and its Application to Model Ensemble

Feb 09, 2025

Direct Distributional Optimization for Provable Alignment of Diffusion Models

Feb 05, 2025

Metastable Dynamics of Chain-of-Thought Reasoning: Provable Benefits of Search, RL and Distillation

Feb 02, 2025

Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression

Jan 09, 2025

On the Comparison between Multi-modal and Single-modal Contrastive Learning

Nov 05, 2024

Provably Transformers Harness Multi-Concept Word Semantics for Efficient In-Context Learning

Nov 04, 2024

Pretrained transformer efficiently learns low-dimensional target functions in-context

Nov 04, 2024

Dimensionality-induced information loss of outliers in deep neural networks

Oct 29, 2024