Jeong Hun Yeo

Where Visual Speech Meets Language: VSP-LLM Framework for Efficient and Context-Aware Visual Speech Processing
Feb 23, 2024
Jeong Hun Yeo, Seunghee Han, Minsu Kim, Yong Man Ro

Multilingual Visual Speech Recognition with a Single Model by Learning with Discrete Visual Speech Units
Jan 18, 2024
Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, Se Jin Park, Yong Man Ro

Visual Speech Recognition for Low-resource Languages with Automatic Labels From Whisper Model
Sep 15, 2023
Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, Yong Man Ro

Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-training and Multi-modal Tokens
Sep 15, 2023
Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, Yong Man Ro

Lip Reading for Low-resource Languages by Learning and Combining General Speech Knowledge and Language-specific Knowledge
Aug 18, 2023
Minsu Kim, Jeong Hun Yeo, Jeongsoo Choi, Yong Man Ro

AKVSR: Audio Knowledge Empowered Visual Speech Recognition by Compressing Audio Knowledge of a Pretrained Model
Aug 15, 2023
Jeong Hun Yeo, Minsu Kim, Jeongsoo Choi, Dae Hoe Kim, Yong Man Ro

Multi-Temporal Lip-Audio Memory for Visual Speech Recognition
May 08, 2023
Jeong Hun Yeo, Minsu Kim, Yong Man Ro

Distinguishing Homophenes Using Multi-Head Visual-Audio Memory for Lip Reading
Apr 04, 2022
Minsu Kim, Jeong Hun Yeo, Yong Man Ro
