Hong Zhang

EliGen: Entity-Level Controlled Image Generation with Regional Attention

Jan 02, 2025
Via arXiv

SP$^2$T: Sparse Proxy Attention for Dual-stream Point Transformer

Dec 16, 2024
Via arXiv

Semi-Implicit Neural Ordinary Differential Equations

Dec 15, 2024
Via arXiv

Optimizing NeRF-based SLAM with Trajectory Smoothness Constraints

Oct 11, 2024
Via arXiv

RTAGrasp: Learning Task-Oriented Grasping from Human Videos via Retrieval, Transfer, and Alignment

Sep 24, 2024
Via arXiv

PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects

Sep 22, 2024
Via arXiv

Predicting User Stances from Target-Agnostic Information using Large Language Models

Sep 22, 2024
Via arXiv

FLAF: Focal Line and Feature-constrained Active View Planning for Visual Teach and Repeat

Sep 05, 2024
Via arXiv

Unlocking Attributes' Contribution to Successful Camouflage: A Combined Textual and Visual Analysis Strategy

Aug 22, 2024
Via arXiv

MuRAR: A Simple and Effective Multimodal Retrieval and Answer Refinement Framework for Multimodal Question Answering

Aug 16, 2024
Via arXiv