Wengang Zhou

P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task

Sep 17, 2024
Via arXiv

AdaptVision: Dynamic Input Scaling in MLLMs for Versatile Scene Understanding

Aug 30, 2024
Via arXiv

LaneTCA: Enhancing Video Lane Detection with Temporal Context Aggregation

Aug 25, 2024
Via arXiv

Scaling up Multimodal Pre-training for Sign Language Understanding

Aug 16, 2024
Via arXiv

SwinShadow: Shifted Window for Ambiguous Adjacent Shadow Detection

Aug 07, 2024
Via arXiv

SEDS: Semantically Enhanced Dual-Stream Encoder for Sign Language Retrieval

Jul 23, 2024
Via arXiv

Forest2Seq: Revitalizing Order Prior for Sequential Indoor Scene Synthesis

Jul 07, 2024
Via arXiv

RoFIR: Robust Fisheye Image Rectification Framework Impervious to Optical Center Deviation

Jun 27, 2024
Via arXiv

Text-Animator: Controllable Visual Text Video Generation

Jun 25, 2024
Via arXiv

Semi-Supervised Spoken Language Glossification

Jun 12, 2024
Via arXiv