Sipeng Zheng

UniCode: Learning a Unified Codebook for Multimodal Large Language Models

Mar 14, 2024
Sipeng Zheng, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu

Via arXiv

SPAFormer: Sequential 3D Part Assembly with Transformers

Mar 09, 2024
Boshen Xu, Sipeng Zheng, Qin Jin

Via arXiv

POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-View World

Mar 09, 2024
Boshen Xu, Sipeng Zheng, Qin Jin

Via arXiv

Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds

Oct 20, 2023
Sipeng Zheng, Jiazheng Liu, Yicheng Feng, Zongqing Lu

Via arXiv

LLaMA Rider: Spurring Large Language Models to Explore the Open World

Oct 13, 2023
Yicheng Feng, Yuxuan Wang, Jiazheng Liu, Sipeng Zheng, Zongqing Lu

Via arXiv

No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection

Jul 20, 2023
Qi Zhang, Sipeng Zheng, Qin Jin

Via arXiv

Accommodating Audio Modality in CLIP for Multimodal Processing

Mar 12, 2023
Ludan Ruan, Anwen Hu, Yuqing Song, Liang Zhang, Sipeng Zheng, Qin Jin

Via arXiv

Exploring Anchor-based Detection for Ego4D Natural Language Query

Aug 10, 2022
Sipeng Zheng, Qi Zhang, Bei Liu, Qin Jin, Jianlong Fu

Via arXiv