Hao Li

Jack

From Monocular Vision to Autonomous Action: Guiding Tumor Resection via 3D Reconstruction

Mar 20, 2025 (via arXiv)

DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis

Mar 19, 2025 (via arXiv)

Let Synthetic Data Shine: Domain Reassembly and Soft-Fusion for Single Domain Generalization

Mar 17, 2025 (via arXiv)

Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation

Mar 13, 2025 (via arXiv)

GoT: Unleashing Reasoning Capability of Multimodal Large Language Model for Visual Generation and Editing

Mar 13, 2025 (via arXiv)

Astrea: A MOE-based Visual Understanding Model with Progressive Alignment

Mar 12, 2025 (via arXiv)

Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption

Mar 12, 2025 (via arXiv)

Silent Hazards of Token Reduction in Vision-Language Models: The Hidden Impact on Consistency

Mar 11, 2025 (via arXiv)

Materials Map Integrating Experimental and Computational Data through Graph-Based Machine Learning for Enhanced Materials Discovery

Mar 11, 2025 (via arXiv)

SARA: Structural and Adversarial Representation Alignment for Training-efficient Diffusion Models

Mar 11, 2025 (via arXiv)