Bin Cui

PSEO: Optimizing Post-hoc Stacking Ensemble Through Hyperparameter Tuning

Aug 07, 2025
Via arXiv

PilotRL: Training Language Model Agents via Global Planning-Guided Progressive Reinforcement Learning

Aug 01, 2025
Via arXiv

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Jun 09, 2025
Via arXiv

LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning

Jun 05, 2025
Via arXiv

SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling

May 30, 2025
Via arXiv

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts

May 20, 2025
Via arXiv

Let's Verify Math Questions Step by Step

May 20, 2025
Via arXiv

Thinking Short and Right Over Thinking Long: Serving LLM Reasoning Efficiently and Accurately

May 19, 2025
Via arXiv

SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models

May 15, 2025
Via arXiv

Galvatron: An Automatic Distributed System for Efficient Foundation Model Training

Apr 30, 2025
Via arXiv