Tianjian Li

ZeCO: Zero Communication Overhead Sequence Parallelism for Linear Attention

Jul 02, 2025
Via arXiv

SIMPLEMIX: Frustratingly Simple Mixing of Off- and On-policy Data in Language Model Preference Learning

May 05, 2025
Via arXiv

Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets

Oct 06, 2024
Via arXiv

Benchmarking Language Model Creativity: A Case Study on Code Generation

Jul 12, 2024
Via arXiv

Verifiable by Design: Aligning Language Models to Quote from Pre-Training Data

Apr 05, 2024
Via arXiv

Error Norm Truncation: Robust Training in the Presence of Data Noise for Text Generation Models

Oct 02, 2023
Via arXiv

Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning

May 31, 2023
Via arXiv

Why Does Zero-Shot Cross-Lingual Generation Fail? An Explanation and a Solution

May 27, 2023
Via arXiv