Kentaro Mitsui

Release of Pre-Trained Models for the Japanese Language

Apr 02, 2024
Kei Sawada, Tianyu Zhao, Makoto Shing, Kentaro Mitsui, Akio Kaga, Yukiya Hono, Toshiaki Wakatsuki, Koh Mitsuda

An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition

Dec 06, 2023
Yukiya Hono, Koh Mitsuda, Tianyu Zhao, Kentaro Mitsui, Toshiaki Wakatsuki, Kei Sawada

Towards human-like spoken dialogue generation between AI agents from written dialogue

Oct 02, 2023
Kentaro Mitsui, Yukiya Hono, Kei Sawada

UniFLG: Unified Facial Landmark Generator from Text or Speech

Feb 28, 2023
Kentaro Mitsui, Yukiya Hono, Kei Sawada

Text-Guided Scene Sketch-to-Photo Synthesis

Feb 14, 2023
AprilPyone MaungMaung, Makoto Shing, Kentaro Mitsui, Kei Sawada, Fumio Okura

End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue

Jun 24, 2022
Kentaro Mitsui, Tianyu Zhao, Kei Sawada, Yukiya Hono, Yoshihiko Nankaku, Keiichi Tokuda

MSR-NV: Neural vocoder using multiple sampling rates

Sep 28, 2021
Kentaro Mitsui, Kei Sawada

Multi-speaker Text-to-speech Synthesis Using Deep Gaussian Processes

Aug 07, 2020
Kentaro Mitsui, Tomoki Koriyama, Hiroshi Saruwatari
