Xiyao Wang

University of Toronto

Multi-Preference Lambda-weighted Listwise DPO for Dynamic Preference Alignment

Jun 24, 2025

ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs

Jun 11, 2025

What makes Reasoning Models Different? Follow the Reasoning Leader for Efficient Decoding

Jun 08, 2025

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Jun 05, 2025

DISCO Balances the Scales: Adaptive Domain- and Difficulty-Aware Reinforcement Learning on Imbalanced Data

May 21, 2025

SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement

Apr 10, 2025

Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning

Oct 09, 2024

LLaVA-Critic: Learning to Evaluate Multimodal Models

Oct 03, 2024

Multi-Stage Balanced Distillation: Addressing Long-Tail Challenges in Sequence-Level Knowledge Distillation

Jun 19, 2024

World Models with Hints of Large Language Models for Goal Achieving

Jun 11, 2024