Yu Wu

Wuhan University

Accelerating Transducers through Adjacent Token Merging

Jun 28, 2023

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

Jun 28, 2023

Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation

Jun 17, 2023

Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

Jun 14, 2023

1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation

Jun 08, 2023

DVIS: Decoupled Video Instance Segmentation Framework

Jun 08, 2023

Accurate and Structured Pruning for Efficient Automatic Speech Recognition

May 31, 2023

VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

May 25, 2023

Click-Feedback Retrieval

Apr 28, 2023

Visually-Prompted Language Model for Fine-Grained Scene Graph Generation in an Open World

Mar 23, 2023