Ye Shi

OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model

May 25, 2025

One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

May 24, 2025

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

May 24, 2025

UniDB: A Unified Diffusion Bridge Framework via Stochastic Optimal Control

Feb 09, 2025

Evaluating Image Caption via Cycle-consistent Text-to-Image Generation

Jan 08, 2025

AffordDP: Generalizable Diffusion Policy with Transferable Affordance

Dec 04, 2024

SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model

Dec 02, 2024

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Dec 02, 2024

Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective

Oct 30, 2024

Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method

Sep 29, 2024