Jianhua Tao

WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification

Sep 18, 2024

DPI-TTS: Directional Patch Interaction for Fast-Converging and Style Temporal Modeling in Text-to-Speech

Sep 18, 2024

Text Prompt is Not Enough: Sound Event Enhanced Prompt Adapter for Target Style Audio Generation

Sep 14, 2024

Utilizing Speaker Profiles for Impersonation Audio Detection

Aug 30, 2024

Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models

Aug 24, 2024

Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?

Aug 20, 2024

EELE: Exploring Efficient and Extensible LoRA Integration in Emotional Text-to-Speech

Aug 20, 2024

A Noval Feature via Color Quantisation for Fake Audio Detection

Aug 20, 2024

VQ-CTAP: Cross-Modal Fine-Grained Sequence Representation Learning for Speech Processing

Aug 11, 2024

ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild

Aug 09, 2024