Qiaozhi He

MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning

Mar 26, 2026
Via arXiv

DaPT: A Dual-Path Framework for Multilingual Multi-hop Question Answering

Mar 19, 2026
Via arXiv

When Scaling Fails: Mitigating Audio Perception Decay of LALMs via Multi-Step Perception-Aware Reasoning

Feb 28, 2026
Via arXiv

APR: Penalizing Structural Redundancy in Large Reasoning Models via Anchor-based Process Rewards

Jan 31, 2026
Via arXiv

SERM: Self-Evolving Relevance Model with Agent-Driven Learning from Massive Query Streams

Jan 14, 2026
Via arXiv

Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models

Nov 16, 2025
Via arXiv

GRAM: A Generative Foundation Reward Model for Reward Generalization

Jun 18, 2025
Via arXiv

StickMotion: Generating 3D Human Motions by Drawing a Stickman

Mar 05, 2025
Via arXiv

Boosting Text-To-Image Generation via Multilingual Prompting in Large Multimodal Models

Jan 13, 2025
Via arXiv

LRHP: Learning Representations for Human Preferences via Preference Pairs

Oct 06, 2024
Via arXiv