Bingni Zhang

InfoLaw: Information Scaling Laws for Large Language Models with Quality-Weighted Mixture Data and Repetition

May 04, 2026
Via arXiv

Exploring Polyglot Harmony: On Multilingual Data Allocation for Large Language Models Pretraining

Sep 19, 2025
Via arXiv

MuRating: A High Quality Data Selecting Approach to Multilingual Large Language Model Pretraining

Jul 02, 2025
Via arXiv

MuBench: Assessment of Multilingual Capabilities of Large Language Models Across 61 Languages

Jun 24, 2025
Via arXiv

Frame-Voyager: Learning to Query Frames for Video Large Language Models

Oct 07, 2024
Via arXiv