Lin-shan Lee

Interrupted and cascaded permutation invariant training for speech separation

Oct 28, 2019
Gene-Ping Yang, Szu-Lin Wu, Yao-Wen Mao, Hung-yi Lee, Lin-shan Lee

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering

Apr 16, 2019
Gene-Ping Yang, Chao-I Tuan, Hung-yi Lee, Lin-shan Lee

From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings

Apr 10, 2019
Yi-Chen Chen, Sung-Feng Huang, Hung-yi Lee, Lin-shan Lee

Completely Unsupervised Phoneme Recognition By A Generative Adversarial Network Harmonized With Iteratively Refined Hidden Markov Models

Apr 08, 2019
Kuan-Yu Chen, Che-Ping Tsai, Da-Rong Liu, Hung-yi Lee, Lin-shan Lee

Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection

Nov 07, 2018
Sung-Feng Huang, Yi-Chen Chen, Hung-yi Lee, Lin-shan Lee

Adversarial Training of End-to-end Speech Recognition Using a Criticizing Language Model

Nov 02, 2018
Alexander H. Liu, Hung-yi Lee, Lin-shan Lee

Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data

Oct 30, 2018
Yi-Chen Chen, Chia-Hao Shen, Sung-Feng Huang, Hung-yi Lee, Lin-shan Lee

Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval

Sep 03, 2018
Yi-Chen Chen, Sung-Feng Huang, Chia-Hao Shen, Hung-yi Lee, Lin-shan Lee

Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection

Aug 07, 2018
Yu-Hsuan Wang, Hung-yi Lee, Lin-shan Lee

Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations

Jun 24, 2018
Ju-chieh Chou, Cheng-chieh Yeh, Hung-yi Lee, Lin-shan Lee
