Xiaolong Huang

PyLO: Towards Accessible Learned Optimizers in PyTorch

Jun 12, 2025

MuLoCo: Muon is a practical inner optimizer for DiLoCo

May 29, 2025

Scaling Laws of Synthetic Data for Language Models

Mar 26, 2025

WildLong: Synthesizing Realistic Long-Context Instruction Data at Scale

Feb 23, 2025

Chain-of-Retrieval Augmented Generation

Jan 24, 2025

Bootstrap Your Own Context Length

Dec 25, 2024

Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models

Feb 20, 2024

Multilingual E5 Text Embeddings: A Technical Report

Feb 08, 2024

One Step Learning, One Step Review

Jan 19, 2024

Improving Text Embeddings with Large Language Models

Dec 31, 2023