Yuheng Zhang

ArtVIP: Articulated Digital Assets of Visual Realism, Modular Interaction, and Physical Fidelity for Robot Learning

Jun 06, 2025
Via arXiv

CryoCCD: Conditional Cycle-consistent Diffusion with Biophysical Modeling for Cryo-EM Synthesis

May 29, 2025
Via arXiv

Panoramic Out-of-Distribution Segmentation

May 06, 2025
Via arXiv

Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs

Mar 03, 2025
Via arXiv

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

Feb 24, 2025
Via arXiv

Teaching LLMs to Refine with Tools

Dec 22, 2024
Via arXiv

Noise Matters: Diffusion Model-based Urban Mobility Generation with Collaborative Noise Priors

Dec 06, 2024
Via arXiv

Understanding World or Predicting Future? A Comprehensive Survey of World Models

Nov 21, 2024
Via arXiv

Iterative Nash Policy Optimization: Aligning LLMs with General Preferences via No-Regret Learning

Jun 30, 2024
Via arXiv

LCSim: A Large-Scale Controllable Traffic Simulator

Jun 28, 2024
Via arXiv