Long Chen

University of Kaiserslautern-Landau, MODE Collaboration

SCHEMA: State CHangEs MAtter for Procedure Planning in Instructional Videos

Mar 03, 2024

GenAD: Generative End-to-End Autonomous Driving

Feb 20, 2024

Improving Data Augmentation for Robust Visual Question Answering with Effective Curriculum Learning

Jan 28, 2024

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

Jan 26, 2024

Boundary and Relation Distillation for Semantic Segmentation

Jan 24, 2024

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024

Multiperson Detection and Vital-Sign Sensing Empowered by Space-Time-Coding RISs

Jan 15, 2024

SoundCount: Sound Counting from Raw Audio with Dyadic Decomposition Neural Network

Dec 26, 2023

LingoQA: Video Question Answering for Autonomous Driving

Dec 21, 2023

Beneath the Surface: Unveiling Harmful Memes with Multimodal Reasoning Distilled from Large Language Models

Dec 09, 2023