Yibin Wang

GeometryZero: Improving Geometry Solving for LLM with Group Contrastive Policy Optimization

Jun 08, 2025
Via arXiv

Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay

Jun 05, 2025
Via arXiv

Token-Level Uncertainty Estimation for Large Language Model Reasoning

May 16, 2025
Via arXiv

Efficient Uncertainty Estimation via Distillation of Bayesian Large Language Models

May 16, 2025
Via arXiv

Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning

May 06, 2025
Via arXiv

EduBot -- Can LLMs Solve Personalized Learning and Programming Assignments?

Apr 23, 2025
Via arXiv

Unified Reward Model for Multimodal Understanding and Generation

Mar 07, 2025
Via arXiv

Training-Free Bayesianization for Low-Rank Adapters of Large Language Models

Dec 07, 2024
Via arXiv

LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment

Dec 06, 2024
Via arXiv

MagicFace: Training-free Universal-Style Human Image Customized Synthesis

Aug 15, 2024
Via arXiv