Ruibo Fu

Towards Diverse and Efficient Audio Captioning via Diffusion Models

Sep 14, 2024
Via arXiv

A Noval Feature via Color Quantisation for Fake Audio Detection

Aug 20, 2024
Via arXiv

EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech

Aug 20, 2024
Via arXiv

Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?

Aug 20, 2024
Via arXiv

Temporal Variability and Multi-Viewed Self-Supervised Representations to Tackle the ASVspoof5 Deepfake Challenge

Aug 13, 2024
Via arXiv

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

Aug 11, 2024
Via arXiv

MDPE: A Multimodal Deception Dataset with Personality and Emotional Characteristics

Jul 17, 2024
Via arXiv

ASRRL-TTS: Agile Speaker Representation Reinforcement Learning for Text-to-Speech Speaker Adaptation

Jul 07, 2024
Via arXiv

Fake News Detection and Manipulation Reasoning via Large Vision-Language Models

Jul 02, 2024
Via arXiv

A multi-speaker multi-lingual voice cloning system based on vits2 for limmits 2024 challenge

Jun 22, 2024
Via arXiv