Lin-shan Lee

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering

Apr 16, 2019
Via arXiv

From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings

Apr 10, 2019
Via arXiv

Completely Unsupervised Phoneme Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models

Apr 08, 2019
Via arXiv

Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection

Nov 07, 2018
Via arXiv

Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model

Nov 02, 2018
Via arXiv

Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data

Oct 30, 2018
Via arXiv

Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval

Sep 03, 2018
Via arXiv

Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection

Aug 07, 2018
Via arXiv

Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations

Jun 24, 2018
Via arXiv

Transcribing Lyrics From Commercial Song Audio: The First Step Towards Singing Content Processing

Apr 15, 2018
Via arXiv