Buye Xu

Improving Resource-Efficient Speech Enhancement via Neural Differentiable DSP Vocoder Refinement

Aug 20, 2025
Via arXiv

A Novel Deep Learning Framework for Efficient Multichannel Acoustic Feedback Control

May 21, 2025
Via arXiv

Efficient Audiovisual Speech Processing via MUTUD: Multimodal Training and Unimodal Deployment

Jan 30, 2025
Via arXiv

Modulating State Space Model with SlowFast Framework for Compute-Efficient Ultra Low-Latency Speech Enhancement

Nov 04, 2024
Via arXiv

Dynamic Gated Recurrent Neural Network for Compute-efficient Speech Enhancement

Aug 22, 2024
Via arXiv

FoVNet: Configurable Field-of-View Speech Enhancement with Low Computation and Distortion for Smart Glasses

Aug 12, 2024
Via arXiv

All Neural Low-latency Directional Speech Extraction

Jul 05, 2024
Via arXiv

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling

Jun 17, 2024
Via arXiv

A Closer Look at Wav2Vec2 Embeddings for On-Device Single-Channel Speech Enhancement

Mar 03, 2024
Via arXiv

Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement

Jan 15, 2024
Via arXiv