Satoru Fukayama

The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge

Jan 22, 2026

OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder

Jul 18, 2025

Voice Conversion for Likability Control via Automated Rating of Speech Synthesis Corpora

Jul 02, 2025

IdolSongsJp Corpus: A Multi-Singer Song Corpus in the Style of Japanese Idol Groups

Jul 02, 2025

Prosodically Enhanced Foreign Accent Simulation by Discrete Token-based Resynthesis Only with Native Speech Corpora

May 22, 2025

Discrete Tokens Exhibit Interlanguage Speech Intelligibility Benefit: an Analytical Study Towards Accent-robust ASR Only with Native Speech Data

May 22, 2025

Discrete Speech Unit Extraction via Independent Component Analysis

Jan 11, 2025

Self-Supervised Speech Representations are More Phonetic than Semantic

Jun 12, 2024

jaCappella Corpus: A Japanese a Cappella Vocal Ensemble Corpus

Dec 09, 2022

Hyperbolic Timbre Embedding for Musical Instrument Sound Synthesis Based on Variational Autoencoders

Sep 27, 2022