Bin Xu

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Aug 08, 2025
Via arXiv

Channel-Independent Federated Traffic Prediction

Aug 06, 2025
Via arXiv

GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning

Jul 02, 2025
Via arXiv

VerIF: Verification Engineering for Reinforcement Learning in Instruction Following

Jun 11, 2025
Via arXiv

From Swath to Full-Disc: Advancing Precipitation Retrieval with Multimodal Knowledge Expansion

Jun 08, 2025
Via arXiv

S2R-Bench: A Sim-to-Real Evaluation Benchmark for Autonomous Driving

May 24, 2025
Via arXiv

BLAST: Balanced Sampling Time Series Corpus for Universal Forecasting Models

May 23, 2025
Via arXiv

AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios

May 22, 2025
Via arXiv

EduBench: A Comprehensive Benchmarking Dataset for Evaluating Large Language Models in Diverse Educational Scenarios

May 22, 2025
Via arXiv

DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models

May 13, 2025
Via arXiv