Rui Liu

NE-PADD: Leveraging Named Entity Knowledge for Robust Partial Audio Deepfake Detection via Attention Aggregation

Sep 04, 2025
Via arXiv

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Aug 27, 2025
Via arXiv

UniTalker: Conversational Speech-Visual Synthesis

Aug 06, 2025
Via arXiv

Scene-aware SAR ship detection guided by unsupervised sea-land segmentation

Jun 15, 2025
Via arXiv

Ming-Omni: A Unified Multimodal Model for Perception and Generation

Jun 11, 2025
Via arXiv

Towards Emotionally Consistent Text-Based Speech Editing: Introducing EmoCorrector and The ECD-TSE Dataset

May 24, 2025
Via arXiv

R1-Compress: Long Chain-of-Thought Compression via Chunk Compression and Search

May 22, 2025
Via arXiv

Chain-Talker: Chain Understanding and Rendering for Empathetic Conversational Speech Synthesis

May 19, 2025
Via arXiv

Continuous Optimization for Feature Selection with Permutation-Invariant Embedding and Policy-Guided Search

May 16, 2025
Via arXiv

Ming-Lite-Uni: Advancements in Unified Architecture for Natural Multimodal Interaction

May 05, 2025
Via arXiv