Anurag Kumar

The impact of removing head movements on audio-visual speech enhancement

Feb 02, 2022
Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar

NICE-Beam: Neural Integrated Covariance Estimators for Time-Varying Beamformers

Dec 08, 2021
Jonah Casebeer, Jacob Donley, Daniel Wong, Buye Xu, Anurag Kumar

Conformer-Based Self-Supervised Learning for Non-Speech Audio Tasks

Nov 10, 2021
Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf

Multichannel Speech Enhancement without Beamforming

Oct 25, 2021
Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang

TADRN: Triple-Attentive Dual-Recurrent Network for Ad-hoc Array Multichannel Speech Enhancement

Oct 22, 2021
Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang

TPARN: Triple-path Attentive Recurrent Network for Time-domain Multichannel Speech Enhancement

Oct 20, 2021
Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang

Continual self-training with bootstrapped remixing for speech enhancement

Oct 19, 2021
Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar

Ego4D: Around the World in 3,000 Hours of Egocentric Video

Oct 13, 2021
Kristen Grauman, Andrew Westbury, Eugene Byrne, Zachary Chavis, Antonino Furnari, Rohit Girdhar, Jackson Hamburger, Hao Jiang, Miao Liu, Xingyu Liu, Miguel Martin, Tushar Nagarajan, Ilija Radosavovic, Santhosh Kumar Ramakrishnan, Fiona Ryan, Jayant Sharma, Michael Wray, Mengmeng Xu, Eric Zhongcong Xu, Chen Zhao, Siddhant Bansal, Dhruv Batra, Vincent Cartillier, Sean Crane, Tien Do, Morrie Doulaty, Akshay Erapalli, Christoph Feichtenhofer, Adriano Fragomeni, Qichen Fu, Christian Fuegen, Abrham Gebreselasie, Cristina Gonzalez, James Hillis, Xuhua Huang, Yifei Huang, Wenqi Jia, Weslie Khoo, Jachym Kolar, Satwik Kottur, Anurag Kumar, Federico Landini, Chao Li, Yanghao Li, Zhenqiang Li, Karttikeya Mangalam, Raghava Modhugu, Jonathan Munro, Tullie Murrell, Takumi Nishiyasu, Will Price, Paola Ruiz Puentes, Merey Ramazanova, Leda Sari, Kiran Somasundaram, Audrey Southerland, Yusuke Sugano, Ruijie Tao, Minh Vo, Yuchen Wang, Xindi Wu, Takuma Yagi, Yunyi Zhu, Pablo Arbelaez, David Crandall, Dima Damen, Giovanni Maria Farinella, Bernard Ghanem, Vamsi Krishna Ithapu, C. V. Jawahar, Hanbyul Joo, Kris Kitani, Haizhou Li, Richard Newcombe, Aude Oliva, Hyun Soo Park, James M. Rehg, Yoichi Sato, Jianbo Shi, Mike Zheng Shou, Antonio Torralba, Lorenzo Torresani, Mingfei Yan, Jitendra Malik

Incorporating Real-world Noisy Speech in Neural-network-based Speech Enhancement Systems

Sep 21, 2021
Yangyang Xia, Buye Xu, Anurag Kumar
