"speech": models, code, and papers

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Mar 22, 2024
Yi Wang, Kunchang Li, Xinhao Li, Jiashuo Yu, Yinan He, Guo Chen, Baoqi Pei, Rongkun Zheng, Jilan Xu, Zun Wang, Yansong Shi, Tianxiang Jiang, Songze Li, Hongjie Zhang, Yifei Huang, Yu Qiao, Yali Wang, Limin Wang

SICRN: Advancing Speech Enhancement through State Space Model and Inplace Convolution Techniques

Feb 22, 2024
Changjiang Zhao, Shulin He, Xueliang Zhang

FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer

Mar 21, 2024
Dongyeong Hwang, Hyunju Kim, Sunwoo Kim, Kijung Shin

Dialogue Understandability: Why are we streaming movies with subtitles?

Mar 22, 2024
Helard Becerra Martinez, Alessandro Ragano, Diptasree Debnath, Asad Ullah, Crisron Rudolf Lucas, Martin Walsh, Andrew Hines

SynerMix: Synergistic Mixup Solution for Enhanced Intra-Class Cohesion and Inter-Class Separability in Image Classification

Mar 24, 2024
Ye Xu, Ya Gao, Xiaorong Qiu, Yang Chen, Ying Ji

What do neural networks listen to? Exploring the crucial bands in Speech Enhancement using Sinc-convolution

Mar 04, 2024
Kuan-Hsun Ho, Jeih-weih Hung, Berlin Chen

How do Hyenas deal with Human Speech? Speech Recognition and Translation with ConfHyena

Feb 20, 2024
Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli

Open Access NAO (OAN): a ROS2-based software framework for HRI applications with the NAO robot

Mar 20, 2024
Antonio Bono, Kenji Brameld, Luigi D'Alfonso, Giuseppe Fedele

AnnoTheia: A Semi-Automatic Annotation Toolkit for Audio-Visual Speech Technologies

Feb 20, 2024
José-M. Acosta-Triana, David Gimeno-Gómez, Carlos-D. Martínez-Hinarejos

Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?

Feb 19, 2024
Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli
