Joanna Hong

Intuitive Multilingual Audio-Visual Speech Recognition with a Single-Trained Model

Oct 23, 2023
Joanna Hong, Se Jin Park, Yong Man Ro

DiffV2S: Diffusion-based Video-to-Speech Synthesis with Vision-guided Speaker Embedding

Aug 15, 2023
Jeongsoo Choi, Joanna Hong, Yong Man Ro

Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring

Mar 20, 2023
Joanna Hong, Minsu Kim, Jeongsoo Choi, Yong Man Ro

Lip-to-Speech Synthesis in the Wild with Multi-task Learning

Feb 17, 2023
Minsu Kim, Joanna Hong, Yong Man Ro

SyncTalkFace: Talking Face Generation with Precise Lip-Syncing via Audio-Lip Memory

Nov 03, 2022
Se Jin Park, Minsu Kim, Joanna Hong, Jeongsoo Choi, Yong Man Ro

Visual Context-driven Audio Feature Enhancement for Robust End-to-End Audio-Visual Speech Recognition

Jul 13, 2022
Joanna Hong, Minsu Kim, Daehun Yoo, Yong Man Ro

VisageSynTalk: Unseen Speaker Video-to-Speech Synthesis via Speech-Visage Feature Selection

Jun 15, 2022
Joanna Hong, Minsu Kim, Yong Man Ro

Lip to Speech Synthesis with Visual Context Attentional GAN

Apr 04, 2022
Minsu Kim, Joanna Hong, Yong Man Ro

Multi-modality Associative Bridging through Memory: Speech Sound Recollected from Face Video

Apr 04, 2022
Minsu Kim, Joanna Hong, Se Jin Park, Yong Man Ro
