
Publications by Gordon Wichern

Enhanced Reverberation as Supervision for Unsupervised Speech Separation

Aug 06, 2024

TF-Locoformer: Transformer with Local Modeling by Convolution for Speech Separation and Enhancement

Aug 06, 2024

Sound Event Bounding Boxes

Jun 06, 2024

SMITIN: Self-Monitored Inference-Time INtervention for Generative Music Transformers

Apr 02, 2024

Why does music source separation benefit from cacophony?

Feb 28, 2024

NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization

Feb 27, 2024

NeuroHeed+: Improving Neuro-steered Speaker Extraction with Joint Auditory Attention Detection

Dec 12, 2023

Scenario-Aware Audio-Visual TF-GridNet for Target Speech Extraction

Oct 30, 2023

Generation or Replication: Auscultating Audio Latent Diffusion Models

Oct 16, 2023

Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation

Sep 29, 2023