Shizhe Chen

INRIA

HO-Flow: Generalizable Hand-Object Interaction Generation with Latent Flow Matching

Apr 12, 2026
Via arXiv

FIRE-CIR: Fine-grained Reasoning for Composed Fashion Image Retrieval

Apr 10, 2026
Via arXiv

MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping

Mar 23, 2026
Via arXiv

Gondola: Grounded Vision Language Planning for Generalizable Robotic Manipulation

Jun 12, 2025
Via arXiv

ComposeAnything: Composite Object Priors for Text-to-Image Generation

May 30, 2025
Via arXiv

HORT: Monocular Hand-held Objects Reconstruction with Transformers

Mar 27, 2025
Via arXiv

Online 3D Scene Reconstruction Using Neural Object Priors

Mar 24, 2025
Via arXiv

NextBestPath: Efficient 3D Mapping of Unseen Environments

Feb 07, 2025
Via arXiv

Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy

Oct 02, 2024
Via arXiv

Conan-embedding: General Text Embedding with More and Better Negative Samples

Aug 29, 2024
Via arXiv