Wenbo Su

Reinforcement Learning Optimization for Large-Scale Learning: An Efficient and User-Friendly Scaling Library

Jun 06, 2025
Via arXiv

Weight Spectra Induced Efficient Model Adaptation

May 29, 2025
Via arXiv

USB: A Comprehensive and Unified Safety Evaluation Benchmark for Multimodal Large Language Models

May 26, 2025
Via arXiv

Beyond Safe Answers: A Benchmark for Evaluating True Risk Awareness in Large Reasoning Models

May 26, 2025
Via arXiv

NAN: A Training-Free Solution to Coefficient Estimation in Model Merging

May 22, 2025
Via arXiv

Think-J: Learning to Think for Generative LLM-as-a-Judge

May 20, 2025
Via arXiv

Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation

Mar 20, 2025
Via arXiv

ECKGBench: Benchmarking Large Language Models in E-commerce Leveraging Knowledge Graph

Mar 20, 2025
Via arXiv

ChineseEcomQA: A Scalable E-commerce Concept Evaluation Benchmark for Large Language Models

Feb 27, 2025
Via arXiv

Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning?

Feb 26, 2025
Via arXiv