Rui Men

MTR-Bench: A Comprehensive Benchmark for Multi-Turn Reasoning Evaluation

May 26, 2025

Qwen3 Technical Report

May 14, 2025

Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free

May 10, 2025

HellaSwag-Pro: A Large-Scale Bilingual Benchmark for Evaluating the Robustness of LLMs in Commonsense Reasoning

Feb 17, 2025

Qwen2.5-1M Technical Report

Jan 26, 2025

Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models

Jan 21, 2025

Qwen2.5 Technical Report

Dec 19, 2024

Qwen2.5-Coder Technical Report

Sep 18, 2024

Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution

Sep 18, 2024

Qwen2 Technical Report

Jul 16, 2024