Haiwen Feng

Vision-as-Inverse-Graphics Agent via Interleaved Multimodal Reasoning

Jan 16, 2026

Visually Prompted Benchmarks Are Surprisingly Fragile

Dec 19, 2025

Flow Matching Policy Gradients

Jul 28, 2025

St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World

Apr 17, 2025

ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness

Mar 13, 2025

Predicting 4D Hand Trajectory from Monocular Videos

Jan 14, 2025

InterDyn: Controllable Interactive Dynamics with Video Diffusion Models

Dec 16, 2024

GenLit: Reformulating Single-Image Relighting as Video Generation

Dec 15, 2024

Toward Human Understanding with Controllable Synthesis

Nov 13, 2024

Can Large Language Models Understand Symbolic Graphics Programs?

Aug 15, 2024