Chung-Cheng Chiu

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024
Via arXiv

Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation

Feb 20, 2024
Via arXiv

Gemini: A Family of Highly Capable Multimodal Models

Dec 19, 2023
Via arXiv

SLM: Bridge the thin gap between speech and text foundation models

Sep 30, 2023
Via arXiv

Efficient Adapters for Giant Speech Models

Jun 13, 2023
Via arXiv

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023
Via arXiv

Textless Direct Speech-to-Speech Translation with Discrete Speech Representation

Oct 31, 2022
Via arXiv

Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data

May 16, 2022
Via arXiv

Self-supervised Learning with Random-projection Quantizer for Speech Recognition

Feb 03, 2022
Via arXiv

Cross-attention conformer for context modeling in speech enhancement for ASR

Oct 30, 2021
Via arXiv