Chengliang Chai

Not All Documents Are What You Need for Extracting Instruction Tuning Data

May 18, 2025
Via arXiv

LEAD: Iterative Data Selection for Efficient LLM Instruction Tuning

May 12, 2025
Via arXiv

Handling Label Noise via Instance-Level Difficulty Modeling and Dynamic Optimization

May 01, 2025
Via arXiv

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models

Sep 25, 2024
Via arXiv

Cost-Effective In-Context Learning for Entity Resolution: A Design Space Exploration

Dec 07, 2023
Via arXiv

Crowd-Powered Data Mining

Oct 19, 2018
Via arXiv