Alexander Richard

ScoreDec: A Phase-preserving High-Fidelity Audio Codec with A Generalized Score-based Diffusion Post-filter

Jan 22, 2024
Yi-Chiao Wu, Dejan Marković, Steven Krenn, Israel D. Gebru, Alexander Richard

From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Jan 03, 2024
Evonne Ng, Javier Romero, Timur Bagautdinov, Shaojie Bai, Trevor Darrell, Angjoo Kanazawa, Alexander Richard

Sounding Bodies: Modeling 3D Spatial Sound of Humans Using Body Pose and Audio

Nov 01, 2023
Xudong Xu, Dejan Markovic, Jacob Sandakly, Todd Keebler, Steven Krenn, Alexander Richard

AudioDec: An Open-source Streaming High-fidelity Neural Audio Codec

May 26, 2023
Yi-Chiao Wu, Israel D. Gebru, Dejan Marković, Alexander Richard

Novel-View Acoustic Synthesis

Jan 23, 2023
Changan Chen, Alexander Richard, Roman Shapovalov, Vamsi Krishna Ithapu, Natalia Neverova, Kristen Grauman, Andrea Vedaldi

Multiface: A Dataset for Neural Face Rendering

Jul 22, 2022
Cheng-hsin Wuu, Ningyuan Zheng, Scott Ardisson, Rohan Bali, Danielle Belko, Eric Brockmeyer, Lucas Evans, Timothy Godisart, Hyowon Ha, Alexander Hypes, Taylor Koska, Steven Krenn, Stephen Lombardi, Xiaomin Luo, Kevyn McPhail, Laura Millerschoen, Michal Perdoch, Mark Pitts, Alexander Richard, Jason Saragih, Junko Saragih, Takaaki Shiratori, Tomas Simon, Matt Stewart, Autumn Trimble, Xinshuo Weng, David Whitewolf, Chenglei Wu, Shoou-I Yu, Yaser Sheikh

End-to-End Binaural Speech Synthesis

Jul 08, 2022
Wen Chin Huang, Dejan Markovic, Alexander Richard, Israel Dejene Gebru, Anjali Menon

Implicit Neural Spatial Filtering for Multichannel Source Separation in the Waveform Domain

Jun 30, 2022
Dejan Markovic, Alexandre Defossez, Alexander Richard

Audio-Visual Speech Codecs: Rethinking Audio-Visual Speech Enhancement by Re-Synthesis

Mar 31, 2022
Karren Yang, Dejan Markovic, Steven Krenn, Vasu Agrawal, Alexander Richard

LiP-Flow: Learning Inference-time Priors for Codec Avatars via Normalizing Flows in Latent Space

Mar 15, 2022
Emre Aksan, Shugao Ma, Akin Caliskan, Stanislav Pidhorskyi, Alexander Richard, Shih-En Wei, Jason Saragih, Otmar Hilliges
