Hanlin Zhang

DSA-Tokenizer: Disentangled Semantic-Acoustic Tokenization via Flow Matching-based Hierarchical Fusion

Jan 15, 2026
Via arXiv

Discovering Hierarchical Latent Capabilities of Language Models via Causal Representation Learning

Jun 12, 2025
Via arXiv

Connections between Schedule-Free Optimizers, AdEMAMix, and Accelerated SGD Variants

Feb 04, 2025
Via arXiv

Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models

Dec 03, 2024
Via arXiv

How Does Critical Batch Size Scale in Pre-training?

Oct 29, 2024
Via arXiv

Eliminating Position Bias of Language Models: A Mechanistic Approach

Jul 01, 2024
Via arXiv

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024
Via arXiv

CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-training

Jun 15, 2024
Via arXiv

Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems

Feb 27, 2024
Via arXiv

A Study on the Calibration of In-context Learning

Dec 11, 2023
Via arXiv