Szu-Wei Fu

Test-Time Scaling Strategies for Generative Retrieval in Multimodal Conversational Recommendations

Aug 25, 2025

DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment

Jul 03, 2025

Universal Speech Enhancement with Regression and Generative Mamba

May 27, 2025

Linguistic Knowledge Transfer Learning for Speech Enhancement

Mar 10, 2025

Detecting the Undetectable: Assessing the Efficacy of Current Spoof Detection Methods Against Seamless Speech Edits

Jan 07, 2025

NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts

Nov 08, 2024

RankUp: Boosting Semi-Supervised Regression with an Auxiliary Ranking Classifier

Oct 29, 2024

Developing Instruction-Following Speech Language Model Without Speech Instruction-Tuning Data

Sep 30, 2024

Generative Speech Foundation Model Pretraining for High-Quality Speech Extraction and Restoration

Sep 25, 2024

The VoiceMOS Challenge 2024: Beyond Speech Quality Prediction

Sep 11, 2024