Xiaokang Yang

POLAR: A Portrait OLAT Dataset and Generative Framework for Illumination-Aware Face Modeling

Dec 16, 2025

MoRE: 3D Visual Geometry Reconstruction Meets Mixture-of-Experts

Oct 31, 2025

Scaling Up Occupancy-centric Driving Scene Generation: Dataset and Method

Oct 27, 2025

FieldGen: From Teleoperated Pre-Manipulation Trajectories to Field-Guided Data Generation

Oct 23, 2025

Expertise need not monopolize: Action-Specialized Mixture of Experts for Vision-Language-Action Learning

Oct 16, 2025

Flow-Matching Guided Deep Unfolding for Hyperspectral Image Reconstruction

Oct 02, 2025

FlashEdit: Decoupling Speed, Structure, and Semantics for Precise Image Editing

Sep 26, 2025

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Aug 27, 2025

NEARL-CLIP: Interacted Query Adaptation with Orthogonal Regularization for Medical Vision-Language Understanding

Aug 06, 2025

Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions

Aug 06, 2025