Bin Cui

Rethinking Text-to-SQL: Dynamic Multi-turn SQL Interaction for Real-world Database Exploration

Oct 30, 2025

PSEO: Optimizing Post-hoc Stacking Ensemble Through Hyperparameter Tuning

Aug 07, 2025

PilotRL: Training Language Model Agents via Global Planning-Guided Progressive Reinforcement Learning

Aug 01, 2025

Learning What Reinforcement Learning Can't: Interleaved Online Fine-Tuning for Hardest Questions

Jun 09, 2025

LogicPuzzleRL: Cultivating Robust Mathematical Reasoning in LLMs via Reinforcement Learning

Jun 05, 2025

SALE : Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling

May 30, 2025

Let's Verify Math Questions Step by Step

May 20, 2025

LoVR: A Benchmark for Long Video Retrieval in Multimodal Contexts

May 20, 2025

Thinking Short and Right Over Thinking Long: Serving LLM Reasoning Efficiently and Accurately

May 19, 2025

SAS-Bench: A Fine-Grained Benchmark for Evaluating Short Answer Scoring with Large Language Models

May 15, 2025