Ye Shi

OpenHOI: Open-World Hand-Object Interaction Synthesis with Multimodal Large Language Model

May 25, 2025

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

May 24, 2025

One Policy but Many Worlds: A Scalable Unified Policy for Versatile Humanoid Locomotion

May 24, 2025

UniDB: A Unified Diffusion Bridge Framework via Stochastic Optimal Control

Feb 09, 2025

Evaluating Image Caption via Cycle-consistent Text-to-Image Generation

Jan 08, 2025

AffordDP: Generalizable Diffusion Policy with Transferable Affordance

Dec 04, 2024

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Dec 02, 2024

SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model

Dec 02, 2024

Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective

Oct 30, 2024

Federated Learning from Vision-Language Foundation Models: Theoretical Analysis and Method

Sep 29, 2024