Satoshi Suzuki

Few-shot Personalization via In-Context Learning for Speech Emotion Recognition based on Speech-Language Model

Sep 10, 2025

Test-time Adaptation Meets Image Enhancement: Improving Accuracy via Uncertainty-aware Logit Switching

Mar 26, 2024

Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff

Aug 31, 2023

End-to-End Joint Target and Non-Target Speakers ASR

Jun 04, 2023

On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis

Oct 28, 2022

Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data

Jul 11, 2022