Piotr Żelasko

BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5

Jun 28, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

Jun 28, 2024

Regularizing Contrastive Predictive Coding for Speech Applications

Apr 26, 2023

Delay-penalized transducer for low-latency streaming ASR

Oct 31, 2022

Fast and parallel decoding for transducer

Oct 31, 2022

Time-domain speech super-resolution with GAN based modeling for telephony speaker verification

Sep 04, 2022

Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations

Aug 10, 2022

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

Jan 28, 2022

Lhotse: a speech data representation library for the modern deep learning ecosystem

Oct 25, 2021

Unsupervised Speech Segmentation and Variable Rate Representation Learning using Segmental Contrastive Predictive Coding

Oct 08, 2021