Jan Cernocky

Streaming Endpointer for Spoken Dialogue using Neural Audio Codecs and Label-Delayed Training

Jun 08, 2025
Via arXiv

Fine-tune Before Structured Pruning: Towards Compact and Accurate Self-Supervised Models for Speaker Diarization

May 30, 2025
Via arXiv

Target Speech Extraction with Pre-trained Self-supervised Learning Models

Feb 17, 2024
Via arXiv

Probing Self-supervised Learning Models with Target Speech Extraction

Feb 17, 2024
Via arXiv

DiaCorrect: Error Correction Back-end For Speaker Diarization

Sep 15, 2023
Via arXiv

End-to-End Open Vocabulary Keyword Search With Multilingual Neural Representations

Aug 15, 2023
Via arXiv

Extracting speaker and emotion information from self-supervised speech models via channel-wise correlations

Oct 15, 2022
Via arXiv

An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification

Oct 03, 2022
Via arXiv

DPCCN: Densely-Connected Pyramid Complex Convolutional Network for Robust Speech Separation And Extraction

Dec 27, 2021
Via arXiv

A Hierarchical Subspace Model for Language-Attuned Acoustic Unit Discovery

Nov 09, 2020
Via arXiv