Shengyi Qian

Multi-Object Hallucination in Vision-Language Models

Jul 08, 2024
Via arXiv

3D-MVP: 3D Multiview Pretraining for Robotic Manipulation

Jun 26, 2024
Via arXiv

Multimodal Graph Benchmark

Jun 24, 2024
Via arXiv

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Jun 12, 2024
Via arXiv

3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs

Jun 07, 2024
Via arXiv

LinkGPT: Teaching Large Language Models To Predict Missing Links

Jun 07, 2024
Via arXiv

AffordanceLLM: Grounding Affordance from Vision Language Models

Jan 12, 2024
Via arXiv

LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent

Sep 21, 2023
Via arXiv

SpotTarget: Rethinking the Effect of Target Edges for Link Prediction in Graph Neural Networks

Jun 01, 2023
Via arXiv

Understanding 3D Object Interaction from a Single Image

May 16, 2023
Via arXiv