Yu Wu

Wuhan University

Accelerating Transducers through Adjacent Token Merging

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

Prompting Large Language Models for Zero-Shot Domain Adaptation in Speech Recognition

Jun 28, 2023
Yuang Li, Yu Wu, Jinyu Li, Shujie Liu

Diffusion in Diffusion: Cyclic One-Way Diffusion for Text-Vision-Conditioned Generation

Jun 17, 2023
Yongqi Yang, Ruoyu Wang, Zhihao Qian, Ye Zhu, Yu Wu

Revisit Weakly-Supervised Audio-Visual Video Parsing from the Language Perspective

Jun 14, 2023
Yingying Fan, Yu Wu, Yutian Lin, Bo Du

1st Place Solution for PVUW Challenge 2023: Video Panoptic Segmentation

Jun 08, 2023
Tao Zhang, Xingye Tian, Haoran Wei, Yu Wu, Shunping Ji, Xuebo Wang, Xin Tao, Yuan Zhang, Pengfei Wan

DVIS: Decoupled Video Instance Segmentation Framework

Jun 08, 2023
Tao Zhang, Xingye Tian, Yu Wu, Shunping Ji, Xuebo Wang, Yuan Zhang, Pengfei Wan

Accurate and Structured Pruning for Efficient Automatic Speech Recognition

May 31, 2023
Huiqiang Jiang, Li Lyna Zhang, Yuang Li, Yu Wu, Shijie Cao, Ting Cao, Yuqing Yang, Jinyu Li, Mao Yang, Lili Qiu

VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation

May 25, 2023
Tianrui Wang, Long Zhou, Ziqiang Zhang, Yu Wu, Shujie Liu, Yashesh Gaur, Zhuo Chen, Jinyu Li, Furu Wei
