Weipeng Deng

NoteIt: A System Converting Instructional Videos to Interactable Notes Through Multimodal Video Understanding

Aug 20, 2025

Holistic Tokenizer for Autoregressive Image Generation

Jul 03, 2025

Multi-GraspLLM: A Multimodal LLM for Multi-Hand Semantic Guided Grasp Generation

Dec 11, 2024

Towards Edge General Intelligence via Large Language Models: Opportunities and Challenges

Oct 16, 2024

SegGrasp: Zero-Shot Task-Oriented Grasping via Semantic and Geometric Guided Segmentation

Oct 11, 2024

Can 3D Vision-Language Models Truly Understand Natural Language?

Mar 28, 2024

Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapolation

Mar 11, 2024