Haizhou Li

Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization

Jul 25, 2024

Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning

Jul 21, 2024

GROOT: Generating Robust Watermark for Diffusion-Model-Based Audio Synthesis

Jul 15, 2024

SA-WavLM: Speaker-Aware Self-Supervised Pre-training for Mixture Speech

Jul 03, 2024

Emotion and Intent Joint Understanding in Multimodal Conversation: A Benchmarking Dataset

Jul 03, 2024

DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models

Jul 01, 2024

RefXVC: Cross-Lingual Voice Conversion with Enhanced Reference Leveraging

Jun 24, 2024

Take the essence and discard the dross: A Rethinking on Data Selection for Fine-Tuning Large Language Models

Jun 20, 2024

SD-Eval: A Benchmark Dataset for Spoken Dialogue Understanding Beyond Words

Jun 19, 2024

An Exploration of Length Generalization in Transformer-Based Speech Enhancement

Jun 17, 2024