Yadong Mu

Columbia University

Local Occupancy-Enhanced Object Grasping with Multiple Triplanar Projection

Jul 22, 2024
Via arXiv

InstructLayout: Instruction-Driven 2D and 3D Layout Synthesis with Semantic Graph Prior

Jul 11, 2024
Via arXiv

HumanSplat: Generalizable Single-Image Human Gaussian Splatting with Structure Priors

Jun 18, 2024
Via arXiv

Learning Solution-Aware Transformers for Efficiently Solving Quadratic Assignment Problem

Jun 14, 2024
Via arXiv

RectifID: Personalizing Rectified Flow with Anchored Classifier Guidance

May 23, 2024
Via arXiv

Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images

Apr 25, 2024
Via arXiv

Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion

Apr 17, 2024
Via arXiv

InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior

Feb 07, 2024
Via arXiv

Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization

Feb 06, 2024
Via arXiv

Unified Language-Vision Pretraining in LLM with Dynamic Discrete Visual Tokenization

Sep 29, 2023
Via arXiv