Yuma Koizumi

SNRi Target Training for Joint Speech Enhancement and Recognition

Nov 01, 2021
Via arXiv

DF-Conformer: Integrated architecture of Conv-TasNet and Conformer using linear complexity self-attention for speech enhancement

Jun 30, 2021
Via arXiv

Description and Discussion on DCASE 2021 Challenge Task 2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring under Domain Shifted Conditions

Jun 08, 2021
Via arXiv

Sampling-Frequency-Independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method

May 10, 2021
Via arXiv

Noisy-target Training: A Training Strategy for DNN-based Speech Enhancement without Clean Speech

Jan 21, 2021
Via arXiv

Audio Captioning using Pre-Trained Large-Scale Language Model Guided by Audio-based Similar Caption Retrieval

Dec 14, 2020
Via arXiv

Effects of Word-frequency based Pre- and Post- Processings for Audio Captioning

Sep 24, 2020
Via arXiv

The NTT DCASE2020 Challenge Task 6 system: Automated Audio Captioning with Keywords and Sentence Length Estimation

Jul 01, 2020
Via arXiv

A Transformer-based Audio Captioning Model with Keyword Estimation

Jul 01, 2020
Via arXiv

Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring

Jun 10, 2020
Via arXiv