"speech": models, code, and papers

Security and Privacy Problems in Voice Assistant Applications: A Survey

Apr 19, 2023
Jingjin Li, Chao Chen, Lei Pan, Mostafa Rahimi Azghadi, Hossein Ghodosi, Jun Zhang

Optimizing Deep Learning Models For Raspberry Pi

Apr 25, 2023
Salem Ameen, Kangaranmulle Siriwardana, Theo Theodoridis

QuickVC: Many-to-any Voice Conversion Using Inverse Short-time Fourier Transform for Faster Conversion

Feb 20, 2023
Houjian Guo, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro

Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation with E3Net

Nov 04, 2022
Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju, Min Tang, Tanel Parnamaa, Huaming Wang

Turn-Taking Prediction for Natural Conversational Speech

Aug 29, 2022
Shuo-yiin Chang, Bo Li, Tara N. Sainath, Chao Zhang, Trevor Strohman, Qiao Liang, Yanzhang He

Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization

Sep 28, 2022
Xiao-Ying Zhao, Qiu-Shi Zhu, Jie Zhang

Text with Knowledge Graph Augmented Transformer for Video Captioning

Mar 25, 2023
Xin Gu, Guang Chen, Yufei Wang, Libo Zhang, Tiejian Luo, Longyin Wen

Feature Selection Enhancement and Feature Space Visualization for Speech-Based Emotion Recognition

Aug 19, 2022
Sofia Kanwal, Sohail Asghar, Hazrat Ali

YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding

Oct 12, 2022
Kayode Olaleye, Dan Oneata, Herman Kamper

Integrity and Junkiness Failure Handling for Embedding-based Retrieval: A Case Study in Social Network Search

Apr 18, 2023
Wenping Wang, Yunxi Guo, Chiyao Shen, Shuai Ding, Guangdeng Liao, Hao Fu, Pramodh Karanth Prabhakar
