Guo Chen

EgoVideo: Exploring Egocentric Foundation Model and Downstream Adaptation

Jun 27, 2024
Via arXiv

RH-SQL: Refined Schema and Hardness Prompt for Text-to-SQL

Jun 13, 2024
Via arXiv

SPMamba: State-space model is all you need in speech separation

Apr 02, 2024
Via arXiv

EgoExoLearn: A Dataset for Bridging Asynchronous Ego- and Exo-centric View of Procedural Activities in Real World

Mar 24, 2024
Via arXiv

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Mar 22, 2024
Via arXiv

Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding

Mar 14, 2024
Via arXiv

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

Jan 15, 2024
Via arXiv

Retrieval-Augmented Egocentric Video Captioning

Jan 03, 2024
Via arXiv

Decoupling SQL Query Hardness Parsing for Text-to-SQL

Dec 29, 2023
Via arXiv

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

Dec 03, 2023
Via arXiv