Shentong Mo

IoT-LM: Large Multisensory Language Models for the Internet of Things

Jul 13, 2024
Via arXiv

Semantic Grouping Network for Audio Source Separation

Jul 04, 2024
Via arXiv

MA-AVT: Modality Alignment for Parameter-Efficient Audio-Visual Transformers

Jun 07, 2024
Via arXiv

Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs

Jun 07, 2024
Via arXiv

DMT-JEPA: Discriminative Masked Targets for Joint-Embedding Predictive Architecture

May 28, 2024
Via arXiv

Scaling Diffusion Mamba with Bidirectional SSMs for Efficient Image and Video Generation

May 24, 2024
Via arXiv

Unified Video-Language Pre-training with Synchronized Audio

May 12, 2024
Via arXiv

A Large-scale Medical Visual Task Adaptation Benchmark

Apr 19, 2024
Via arXiv

DailyMAE: Towards Pretraining Masked Autoencoders in One Day

Mar 31, 2024
Via arXiv

Audio-Synchronized Visual Animation

Mar 08, 2024
Via arXiv