Mingyang Song

TAT-R1: Terminology-Aware Translation with Reinforcement Learning and Word Alignment

May 27, 2025

Walk Before You Run! Concise LLM Reasoning via Reinforcement Learning

May 27, 2025

SSR-Zero: Simple Self-Rewarding Reinforcement Learning for Machine Translation

May 22, 2025

OpenThinkIMG: Learning to Think with Images via Visual Tool Reinforcement Learning

May 13, 2025

FastCuRL: Curriculum Reinforcement Learning with Progressive Context Extension for Efficient Training R1-like Reasoning Models

Mar 21, 2025

From Head to Tail: Towards Balanced Representation in Large Vision-Language Models through Adaptive Data Calibration

Mar 17, 2025

GRP: Goal-Reversed Prompting for Zero-Shot Evaluation with LLMs

Mar 08, 2025

PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

Jan 07, 2025

A Survey of Query Optimization in Large Language Models

Dec 23, 2024

MiMoTable: A Multi-scale Spreadsheet Benchmark with Meta Operations for Table Reasoning

Dec 16, 2024